
[Translation] How to Use Small Experiments to Develop a Caption Generation Model in Keras

by kimdee 2020. 3. 29.

'20 JUN: I lost the link to the Colab notebook I was working in while translating, so this post is still in progress.

 

--- 

Original source: https://machinelearningmastery.com/develop-a-caption-generation-model-in-keras/

By Jason Brownlee on Nov 24, 2017 / Last updated on Aug 7, 2019

 


* The post below is a translation of Jason Brownlee's "How to Develop a Caption Generation Model in Keras".

 

Caption generation is a challenging artificial intelligence problem in which a textual description must be generated for a photograph. Solving it requires both computer vision, to understand the content of the image, and a language model from natural language processing, to turn that understanding into words in the right order. Recently, deep learning methods have achieved state-of-the-art results on this problem.

 

Developing a caption generation model on your own data can be very difficult, mainly because the datasets and models are so large that training takes days. An alternative is to explore model configurations on a small sample of the full dataset.

 

In this tutorial, you will discover how to use a small sample of a standard photo captioning dataset to explore different deep model designs. After completing the tutorial, you will know:

 

* How to prepare data for photo captioning modeling.

* How to design a baseline and a test harness to evaluate the skill of a model and control for its stochastic nature.

* How to evaluate properties such as model size, the feature extraction model, and word embeddings in order to lift model skill.

 

Discover how to develop deep learning models for text classification, translation, photo captioning, and more in my new book, which includes 30 step-by-step tutorials and full source code.

https://machinelearningmastery.com/deep-learning-for-nlp/

 


 

Photo credit: https://www.flickr.com/photos/perry-pics/5968641588/

Tutorial Overview

This tutorial is divided into 6 parts:

1. Data Preparation

2. Baseline Caption Generation Model

3. Network Size Parameters

4. Configuring the Feature Extraction Model

5. Word Embedding Models

6. Analysis of Results

 

Python Environment

This tutorial assumes Python 3 with SciPy, scikit-learn, Pandas, NumPy, and Matplotlib installed.

Keras (2.0 or higher) must also be installed, with either the TensorFlow or the Theano backend.
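
To confirm the environment before going further, a small version check can help. The sketch below is not part of the original tutorial; it assumes Python 3.8 or later (for importlib.metadata) and assumes the PyPI distribution names listed in PACKAGES, which may differ on your setup. Only one of the two backends needs to be present.

```python
from importlib import metadata

# Assumed PyPI distribution names; only one backend (tensorflow or theano) is required.
PACKAGES = ["scipy", "scikit-learn", "pandas", "numpy", "matplotlib",
            "keras", "tensorflow", "theano"]


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Any package reported as "not installed" (apart from the backend you do not use) should be installed before running the rest of the code.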

 

ํ™˜๊ฒฝ์„ค์ •์— ๋Œ€ํ•ด ๋„์›€์ด ํ•„์š”ํ•˜๋‹ค๋ฉด ์•„๋ž˜ ํŠœํ† ๋ฆฌ์–ผ์„ ์ฐธ๊ณ ํ•˜์„ธ์š”. 

 

How to set up a Python environment for machine learning and deep learning with Anaconda

https://machinelearningmastery.com/setup-python-environment-machine-learning-deep-learning-anaconda/

 


 

Running the code on a GPU is recommended. You can get inexpensive access to GPUs on Amazon Web Services; see the tutorial below.

 

How to develop and evaluate large deep learning models with Keras on Amazon Web Services

https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/

 


 

1. Data Preparation

 

First, we need to prepare the dataset used to train the model. We will use the Flickr8K dataset, which consists of a little over 8,000 photographs and their descriptions. The dataset can be downloaded below.

 

* Dataset and text (Update: the links on the official site have been taken down, so here are direct download links from the author's GitHub repository.)

https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip

https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip

 

Unzip the photographs and the descriptions into your current working directory, into the Flicker8k_Dataset and Flickr8k_text folders respectively.
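
The unzip step can also be scripted. The following is a minimal sketch, not taken from the original tutorial: extract_archive() is a hypothetical helper, and it assumes both zip files have already been downloaded into the working directory under the names used in the links above. The internal layout of the archives is not verified here, so check the printed member names and adjust the destination if an archive already contains its own top-level folder.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # Assumed archive names and target folders; change them to match your download.
    targets = [
        ("Flickr8k_Dataset.zip", "Flicker8k_Dataset"),
        ("Flickr8k_text.zip", "Flickr8k_text"),
    ]
    for zip_name, folder in targets:
        if not Path(zip_name).exists():
            print(f"{zip_name} not found, skipping")
            continue
        names = extract_archive(zip_name, folder)
        print(f"{zip_name}: extracted {len(names)} entries, first few: {names[:3]}")
```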

 

Data preparation has two parts: preparing the text and preparing the photos.

 

1-1 Text Preparation (if you know a better wording for this heading, please leave a comment.)

The dataset contains multiple descriptions for each photograph, and the description text needs some light cleaning. First, we will load the file that contains all of the descriptions.

# load doc into memory
def load_doc(filename):
	# open the file as read only; the with block closes it even if reading fails
	with open(filename, 'r') as file:
		# read all text
		text = file.read()
	return text

filename = 'Flickr8k_text/Flickr8k.token.txt'
# load descriptions
doc = load_doc(filename)

 

Each photo has a unique identifier, which is used both in the photo filename and in the text file of descriptions. Next, we will step through the list of photo descriptions and save the first description for each photo.

 

Below, we define a function named load_descriptions() that, given the loaded document text, returns a dictionary mapping photo identifiers to descriptions.

 

# extract descriptions for images
def load_descriptions(doc):
	mapping = dict()
	# process lines
	for line in doc.split('\n'):
		# split line by white space
		tokens = line.split()
		# skip blank or malformed lines that lack an id followed by a description
		if len(tokens) < 2:
			continue
		# take the first token as the image id, the rest as the description
		image_id, image_desc = tokens[0], tokens[1:]
		# remove filename from image id
		image_id = image_id.split('.')[0]
		# convert description tokens back to string
		image_desc = ' '.join(image_desc)
		# store the first description for each image
		if image_id not in mapping:
			mapping[image_id] = image_desc
	return mapping

# parse descriptions
descriptions = load_descriptions(doc)
print('Loaded: %d ' % len(descriptions))
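
To see what this parsing produces without downloading the dataset, the same logic can be run on a small in-memory sample. This is an illustrative sketch only: SAMPLE_DOC is made up, first_descriptions() is a hypothetical compact stand-in for load_descriptions(), and the layout "image filename, '#', caption index, whitespace, description" is an assumption about the token file.

```python
# Hypothetical sample lines in the assumed "<file>#<index><tab><description>" layout.
SAMPLE_DOC = (
    "photo_a.jpg#0\tA child climbs a set of stairs .\n"
    "photo_a.jpg#1\tA girl goes into a wooden building .\n"
    "photo_b.jpg#0\tTwo dogs play on the road .\n"
)


def first_descriptions(doc):
    """Map each image id to its first description, mirroring load_descriptions()."""
    mapping = {}
    for line in doc.split("\n"):
        tokens = line.split()
        # skip blank or malformed lines
        if len(tokens) < 2:
            continue
        # "photo_a.jpg#0" -> "photo_a"
        image_id = tokens[0].split(".")[0]
        # keep only the first description seen for each image
        if image_id not in mapping:
            mapping[image_id] = " ".join(tokens[1:])
    return mapping


if __name__ == "__main__":
    for image_id, desc in first_descriptions(SAMPLE_DOC).items():
        print(image_id, "->", desc)
```

The second description of photo_a is dropped because only the first one per identifier is stored, so the resulting dictionary has one entry per photo.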

 

1-2 Photo Preparation

 
