Image Processing with Deep Learning


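1. Environment setup

- Download and install Anaconda

- Create a virtual environment

=> Run Anaconda Prompt as administrator

=> Upgrade pip to the latest version

=> python -m pip install --upgrade pip

=> Create a virtual environment for development

=> Use Python 3.7 for the virtual environment

=> conda create -n cpu_env python=3.7 openssl

=> conda info --envs

=> Switch to the new virtual environment

activate cpu_env

=> The development tool is Jupyter Notebook, a web-based IDE

=> conda install nb_conda

=> Generate the configuration file

=> jupyter notebook --generate-config

c.NotebookApp.notebook_dir # in the generated config file, sets where notebooks are saved

=> Set the working directory

=> Running and using jupyter notebook

=> Select the virtual environment (kernel)

ctrl + enter # run the cell

b # new cell below

a # new cell above

dd # delete the cell

# the number next to a cell is its execution count

# cells share memory: a variable defined in one cell is visible in the others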

2. The Python language and numpy

# Python comment: # ( single-line comment )

'''
This also works as a comment ( a multi-line string literal used as a multi-line comment )
'''

# Python's built-in types

# 1. numeric
# - int, float, complex ( classes )
my_var1 = 100
my_var2 = 3.14

# 2. sequence types ( list, tuple, range )
# - list: written with [], can hold values of any type
my_list = [1, 2, 'Hello', 3.14, True]
# - tuple: similar to list, written with (), read-only: elements cannot be changed or deleted
my_tuple = (1, 2, 'Hello', 3.14, True)
# a tuple with a single element needs a trailing comma
my_tuple = (100,)
# - range: represents a sequence of numbers
my_range = range(100) # the numbers 0 through 99, increasing by 1
for step in range(100):
    pass

# 3. string => text sequence type ( str class )
my_str1 = "This is a soundless outcry!!"
my_str2 = 'Hello World!'

# string formatting
# 'I have 3 apples and 5 bananas!'
num_apple = 3
num_banana = 5
print('I have {} apples and {} bananas!'.format(num_apple, num_banana))

# 4. Map => mapping type, dictionary ( dict class )

my_dict = {"name": "Hong Gil-dong"}

# python => numpy, pandas
# 1. numpy: "numerical Python", optimized for vector and matrix operations

# + on lists concatenates them
my_list1 = [1,2,3]
my_list2 = [4,5,6]

print(my_list1 + my_list2) # [1, 2, 3, 4, 5, 6]

activate cpu_env

conda install numpy

jupyter notebook

# after installing numpy, load the module with import

import numpy

my_arr1 = numpy.array([1,2,3]) # convert the list to a numeric vector

my_arr2 = numpy.array([4,5,6])

print(my_list1) # [1, 2, 3]

print(my_arr1) # [1 2 3]

print(my_arr1 + my_arr2) # element-wise addition: [5 7 9]
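The difference between the two `+` operations can be sketched without numpy. This is only an illustration in plain Python: the `zip` comprehension stands in for what numpy's element-wise addition computes, and the variable names are made up for the example.

```python
# Plain-Python sketch: list concatenation vs. element-wise addition.
list_a = [1, 2, 3]
list_b = [4, 5, 6]

# + on lists joins them end to end
concatenated = list_a + list_b

# adding matching positions is what + does on numpy arrays
elementwise = [a + b for a, b in zip(list_a, list_b)]

print(concatenated)  # [1, 2, 3, 4, 5, 6]
print(elementwise)   # [5, 7, 9]
```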

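=> python, numpy, pandas

AI <- Machine Learning <- Deep Learning (deep learning is a part of machine learning, which is a part of AI)

Limits of explicit programming: some problems cannot be solved by writing out if statements by hand

Machine Learning: prediction vs. analysis

- Unsupervised learning: there is no result (label) data. Given picture 1 (a cat) and picture 2 (a tiger), the model can only group similar items into clusters

- Supervised learning: result data is provided. Picture 1 (a cat) is labeled "cat", picture 2 (a tiger) is labeled "tiger"; this is the y label

-> Learning types by kind of label data:

-> 1. linear regression: prediction from data that follow a linear relationship; the y label covers a continuous range. Example: what score do I get if I study for 7 hours?

-> 2. logistic regression: the y label has 2 values. Example: do I pass or fail if I study for 7 hours?

-> 3. multinomial classification: the y label has a fixed number of values. Example: which grade (A, B, C, F) do I get if I study for 7 hours?

Linear hypothesis: assume the data follow a straight line in the 2D plane, y = ax + b. As a prediction model this is written H(x) = Wx + b, where W is the weight and b is the bias

-> Find the W and b that minimize the value of the cost function; the smallest possible cost is 0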
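To make the cost function concrete, here is a minimal sketch in plain Python, assuming the mean-squared-error cost that the TensorFlow listings in this post use. The function names `hypothesis` and `cost` are chosen for this illustration only.

```python
def hypothesis(W, b, x):
    """Linear hypothesis H(x) = W * x + b."""
    return W * x + b


def cost(W, b, xs, ys):
    """Mean of the squared differences between H(x) and y."""
    errors = [(hypothesis(W, b, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


xs = [1, 2, 3]
ys = [1, 2, 3]

# W = 1, b = 0 fits this data exactly, so the cost reaches its minimum of 0
print(cost(1.0, 0.0, xs, ys))

# W = 2 overshoots every point: squared errors are 1, 4, 9, so the cost is 14/3
print(cost(2.0, 0.0, xs, ys))
```

The further W and b are from the best fit, the larger the cost, which is why training searches for the pair that minimizes it.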

Tensorflow

1. node: the input and output of data, and operations

2. edge: the connection between nodes

3. Tensor: a multi-dimensional array, the data that flows along the edges in one direction

=> Installing tensorflow

# basic tensorflow usage

# install Google's tensorflow, version 1.15

# conda install tensorflow=1.15

# module loading

import tensorflow as tf

node1 = tf.constant("Hello World") # constant node
print(node1)

Concept

node1 = tf.constant("Hello World") # constant node

sess = tf.Session() # create a session ( the runner )

print(node1) # prints the node itself (a Tensor object) without running it

print(sess.run(node1)) # runs the node and prints its value as bytes: b'Hello World'

print(sess.run(node1).decode()) # decodes the bytes to a str: Hello World

Concept

node1 = tf.constant(1, dtype=tf.float32) # tensor 1

node2 = tf.constant(2, dtype=tf.float32) # tensor 2

node3 = node1 + node2 # + operation; edges connect node1 and node2 to node3

sess = tf.Session() # names are case-sensitive

print(sess.run(node3)) # run node3: 3.0

print(sess.run([node2, node3])) # run node2 and node3 in one call

Concept

node1 = tf.placeholder(dtype=tf.float32)

node2 = tf.placeholder(dtype=tf.float32)

node3 = node1 + node2

sess = tf.Session()

# feed_dict "feeds" data into the placeholders at run time
print(sess.run(node3, feed_dict={node1 : 5,
                                 node2 : 10})) # 15.0
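The "build the graph first, run it later" idea can be sketched with a toy evaluator in plain Python. This is not how TensorFlow is implemented; `Placeholder`, `Add`, and `run` are hypothetical names invented here to mirror `tf.placeholder`, the `+` node, and `sess.run` with `feed_dict`.

```python
class Placeholder:
    """A node whose value is supplied only at run time."""


class Add:
    """A node that adds the values of two input nodes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed_dict):
    """Evaluate a node. Nothing is computed until this is called."""
    if isinstance(node, Placeholder):
        return feed_dict[node]
    if isinstance(node, Add):
        return run(node.left, feed_dict) + run(node.right, feed_dict)
    raise TypeError("unknown node type")


a = Placeholder()
b = Placeholder()
total = Add(a, b)  # builds the graph only; no addition happens yet

print(run(total, {a: 5, b: 10}))  # 15
```

Defining `total` does no arithmetic. The addition happens only inside `run`, once the placeholder values are fed in.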

Concept

H = Wx + b

Concept

Gradient descent algorithm
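Gradient descent can be sketched in plain Python before handing it to TensorFlow. The sketch below assumes the mean-squared-error cost and writes its partial derivatives out by hand. The starting values (W = 0, b = 0) are an assumption made so the run is reproducible, whereas the TensorFlow listings start from random values.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

W, b = 0.0, 0.0          # assumed starting point
learning_rate = 0.01
n = len(xs)

for step in range(3000):
    # error of the current hypothesis H(x) = W * x + b on every sample
    errors = [W * x + b - y for x, y in zip(xs, ys)]

    # partial derivatives of cost = mean(error^2) with respect to W and b
    grad_W = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)

    # move a small step against the gradient
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(W, b)  # W ends close to 1 and b close to 0
```

Each pass updates W and b once, which is why the update is repeated thousands of times.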

# linear regression in tensorflow

import tensorflow as tf

# training data set

x = [1, 2, 3]

y = [1, 2, 3]

# Weight & bias
# tf.Variable: a value that changes during training
# tf.random_normal draws random numbers from a normal distribution (mean 0, standard deviation 1 by default)
# shape [1]     -> 1-D, 1 value
# shape [2,3]   -> 2-D, 6 values in total
# shape [3,2,8] -> 3-D, 3 x 2 x 8 = 48 values in total

W = tf.Variable(tf.random_normal([1]), name='weight')

b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis

H = W * x + b # a straight line

# to find W and b, define the cost function: least squares, the mean of the squared errors

cost = tf.reduce_mean(tf.square(H - y)) # H: hypothesis, y: data

# find the W and b that minimize cost by repeatedly following the derivative at the current point

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

train = optimizer.minimize(cost) # one run updates W and b once, so it must be repeated

# session

sess = tf.Session()

# initialization ( required because tf.Variable is used )

sess.run(tf.global_variables_initializer())

# training: W and b keep changing

for step in range(3000):
    # the return value of train is not needed; only cost is printed
    _, cost_val = sess.run([train, cost])
    if step % 300 == 0: # print every 300 steps
        print("cost : {}".format(cost_val))

print(sess.run(W))

print(sess.run(b))

# result of one run: H = 0.99959624 * x + 0.00091766

# linear regression in tensorflow, with placeholders

import tensorflow as tf

# training data set

x_data = [1, 2, 3]

y_data = [1, 2, 3]

# placeholders: the input values are supplied at run time

x = tf.placeholder(dtype=tf.float32)

y = tf.placeholder(dtype=tf.float32)

# Weight & bias

W = tf.Variable(tf.random_normal([1]), name='weight')

b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis

H = W * x + b # a straight line; the goal is to compute H by running it with x values fed in

# to find W and b, define the cost function: least squares, the mean of the squared errors

cost = tf.reduce_mean(tf.square(H - y)) # H: hypothesis, y: data

# find the W and b that minimize cost by repeatedly following the derivative at the current point

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

train = optimizer.minimize(cost) # one run updates W and b once, so it must be repeated

# session

sess = tf.Session()

# initialization ( required because tf.Variable is used )

sess.run(tf.global_variables_initializer())

# training: W and b keep changing

for step in range(3000):
    _, cost_val = sess.run([train, cost],
                           feed_dict={x:x_data,
                                      y:y_data})
    if step % 300 == 0: # print every 300 steps
        print("cost : {}".format(cost_val))

# predict

print(sess.run(H, feed_dict={x:7}))

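The data need to be cleaned before training

activate cpu_env
jupyter notebook

From a line with one input variable to several input variables

Dot product (matrix multiplication): the number of columns of the first matrix must equal the number of rows of the second

matrix hypothesis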
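The shape rule for the matrix hypothesis can be checked with a small plain-Python matrix product. This is a sketch: `matmul` is a hand-written helper, not `tf.matmul`, and the weights of 1.0 are placeholder values chosen only so the result is easy to verify. The X rows are the score data from the TensorFlow listing in this post.

```python
def matmul(A, B):
    """Multiply matrix A (rows x inner) by matrix B (inner x cols)."""
    rows, inner, cols = len(A), len(B), len(B[0])
    if any(len(row) != inner for row in A):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


X = [[73, 80, 75],
     [93, 88, 93],
     [89, 91, 90],
     [96, 98, 100],
     [73, 66, 70]]            # 5 x 3
W = [[1.0], [1.0], [1.0]]     # 3 x 1, placeholder weights

H = matmul(X, W)              # (5 x 3) . (3 x 1) -> 5 x 1
print(len(H), len(H[0]))      # 5 1
print(H[0])                   # [228.0]
```

The number of rows in X can change freely, but its 3 columns must match the 3 rows of W.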

# linear regression with tensorflow

# the x input is not a single value but several: a 2-D matrix

import tensorflow as tf

# training data set

# x: 5 rows x 3 columns; the number of rows may vary, the number of columns is fixed
x_data = [[73,80,75],
          [93,88,93],
          [89,91,90],
          [96,98,100],
          [73,66,70]]

y_data = [[152],[185],[180],[196],[142]] # y: 5 rows x 1 column

# placeholders for the input data

X = tf.placeholder(shape=[None,3], dtype=tf.float32) # only the column count (3) is fixed

Y = tf.placeholder(shape=[None,1], dtype=tf.float32) # only the column count (1) is fixed

# Weight & bias

# shapes: X (5 x 3) . W (3 x 1) => H (5 x 1)

W = tf.Variable(tf.random_normal([3,1]), name='weight') # one weight per input column: 3 rows, 1 column

b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis

# H = W * x + b

H = tf.matmul(X, W) + b

# define a cost function and find the W and b that minimize it: least squares

cost = tf.reduce_mean(tf.square(H-Y)) # mean of the squared differences between H and Y

# train

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) # 1e-5 = 1 x 10^-5

train = optimizer.minimize(cost) # updates W and b to reduce cost

# session, initialization

sess = tf.Session()

sess.run(tf.global_variables_initializer())

# training

for step in range(3000):
    _, cost_val = sess.run([train, cost],
                           feed_dict={X:x_data,
                                      Y:y_data})
    if step % 300 == 0: # print every 300 steps
        print("cost : {}".format(cost_val))

conda install pandas

# Example: predicting the ozone level

# Multi-variable Linear Regression

import tensorflow as tf

import pandas as pd

# data loading

data = pd.read_csv('./data/ozone.csv')

# display(data.head())

# clean the raw data so that it is suitable for training

# remove missing values and outliers (here only rows with missing values are dropped)

data = data.dropna(how='any') # drop rows containing NA

# display(data.head())

# keep only the columns that are needed

data = data[['Ozone', 'Solar.R', 'Wind', 'Temp']]

# display(data.head())

# training data set

x_data = data[['Solar.R', 'Wind', 'Temp']]

x_data = x_data.values # keep only the values (a numpy array)

#print(x_data[0:5,:]) # numpy slicing: the first 5 rows

y_data = data[['Ozone']]

y_data = y_data.values # keep only the values

#print(x_data)

# tensorflow implementation

# 1. placeholder

X = tf.placeholder(shape=[None,3], dtype=tf.float32) # 2-D data with 3 columns

Y = tf.placeholder(shape=[None,1], dtype=tf.float32)

# 2. Weight & bias

W = tf.Variable(tf.random_normal([3,1]), name='weight') # the shape of W depends on X and Y

b = tf.Variable(tf.random_normal([1]), name='bias')

# 3. Hypothesis

H = tf.matmul(X, W) + b

# 4. cost function, least squares

cost = tf.reduce_mean(tf.square(H-Y)) # mean of the squared differences between H and Y

# 5. train: find the W and b that minimize cost

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# 6. session & initialization

sess = tf.Session()

sess.run(tf.global_variables_initializer())

# 7. training loop

for step in range(3000):
    _, cost_val = sess.run([train, cost],
                           feed_dict={X:x_data,
                                      Y:y_data})
    if step % 300 == 0:
        print("cost : {}".format(cost_val))

The cost diverges and training fails

=> The columns have different units and ranges

=> The value ranges must be brought to a common scale

=> Convert every value to a number between 0 and 1

=> Normalize column by column

Column normalization with sklearn (scikit-learn)

activate cpu_env
pip install scikit-learn

from sklearn.preprocessing import MinMaxScaler

scale_x = MinMaxScaler()

scale_y = MinMaxScaler()

scale_x.fit(x_data) # learn the scale: stores the minimum and maximum of each x column

x_data = scale_x.transform(x_data) # use the fitted values to convert x_data to numbers between 0 and 1

print(x_data[0:5,:])
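The arithmetic behind min-max scaling can be sketched in plain Python. This is an illustration of the formula (v - min) / (max - min), not scikit-learn's implementation. The helper names are invented here, the three rows are illustrative values rather than data read from ozone.csv, and unlike `MinMaxScaler` the sketch does not handle a column whose minimum equals its maximum.

```python
def fit_min_max(rows):
    """Return the per-column minimums and maximums."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform(rows, mins, maxs):
    """Scale every column to the range 0..1."""
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, mins, maxs)]
            for row in rows]


def inverse_transform(rows, mins, maxs):
    """Map scaled values back to the original units."""
    return [[v * (hi - lo) + lo for v, lo, hi in zip(row, mins, maxs)]
            for row in rows]


# illustrative rows: solar radiation, wind, temperature
rows = [[190.0, 7.4, 67.0],
        [118.0, 8.0, 72.0],
        [149.0, 12.6, 74.0]]

mins, maxs = fit_min_max(rows)
scaled = transform(rows, mins, maxs)
restored = inverse_transform(scaled, mins, maxs)

print(scaled[0])  # [1.0, 0.0, 0.0]
```

Because the minimums and maximums are stored at fit time, the same values can later scale a new input row and convert a scaled prediction back to its original units.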

# Example: predicting the ozone level, with normalization

# Multi-variable Linear Regression

import tensorflow as tf

import pandas as pd

# for normalization

from sklearn.preprocessing import MinMaxScaler # import a class from a module inside the package

# data loading

data = pd.read_csv('./data/ozone.csv')

# display(data.head())

# clean the raw data so that it is suitable for training

# remove missing values and outliers (here only rows with missing values are dropped)

data = data.dropna(how='any') # drop rows containing NA

# keep only the columns that are needed

data = data[['Ozone', 'Solar.R', 'Wind', 'Temp']]

# training data set

x_data = data[['Solar.R', 'Wind', 'Temp']] # input

x_data = x_data.values # keep only the values

#print(x_data[0:5,:]) # numpy slicing: the first 5 rows

y_data = data[['Ozone']] # output

y_data = y_data.values # keep only the values

scale_x = MinMaxScaler()

scale_y = MinMaxScaler()

scale_x.fit(x_data) # learn the scale: stores the minimum and maximum of each x column

scale_y.fit(y_data)

x_data = scale_x.transform(x_data) # use the fitted values to convert x_data to numbers between 0 and 1

#print(x_data[0:5,:])

y_data = scale_y.transform(y_data)

# tensorflow implementation

# 1. placeholder

X = tf.placeholder(shape=[None,3], dtype=tf.float32) # 2-D data with 3 columns

Y = tf.placeholder(shape=[None,1], dtype=tf.float32)

# 2. Weight & bias

W = tf.Variable(tf.random_normal([3,1]), name='weight') # the shape of W depends on X and Y

b = tf.Variable(tf.random_normal([1]), name='bias')

# 3. Hypothesis

H = tf.matmul(X, W) + b

# 4. cost function, least squares

cost = tf.reduce_mean(tf.square(H-Y)) # mean of the squared differences between H and Y

# 5. train: find the W and b that minimize cost

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# 6. session & initialization

sess = tf.Session()

sess.run(tf.global_variables_initializer())

# 7. training loop

for step in range(3000):
    _, cost_val = sess.run([train, cost],
                           feed_dict={X:x_data,
                                      Y:y_data})
    if step % 300 == 0:
        print("cost : {}".format(cost_val))

# 8. prediction

input_data = [[149, 12.6, 74]] # X values to predict H for: solar radiation, wind, temperature

# X was declared with shape=[None,3], so the input must be 2-D

scale_input_data = scale_x.transform(input_data) # scale the input to between 0 and 1

result = sess.run(H, feed_dict={X:scale_input_data}) # the result is on the 0-1 scale

print(result)

print(scale_y.inverse_transform(result)) # convert back to the original units

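# Logistic regression: one of two outcomes; the y label takes one of two values (0 or 1)

Linear regression: H can grow without bound => a straight line

Even when a large x comes in, H is limited to between 0 and 1 => an S-shaped curve => logistic regression => sigmoid function

sigmoid function

=> the linear hypothesis H is passed through the sigmoid

cost(W,b): least squares is replaced by the cross-entropy cost function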
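The two formulas can be written out in plain Python. This is a sketch of the standard definitions, sigmoid(z) = 1 / (1 + e^(-z)) and the binary cross entropy -(y log h + (1 - y) log(1 - h)). The function names are chosen for this example, and this naive form is only meant for moderate inputs (very large negative z would overflow `math.exp`).

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(h, y):
    """Binary cross-entropy cost for one prediction h and label y (0 or 1)."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))


print(sigmoid(0))    # 0.5
print(sigmoid(10))   # close to 1
print(sigmoid(-10))  # close to 0

# a confident correct prediction costs little, a confident wrong one costs a lot
print(cross_entropy(0.9, 1))
print(cross_entropy(0.1, 1))
```

Because a confident wrong answer is penalized heavily, minimizing this cost pushes the sigmoid output toward the correct label.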

# Logistic regression, Binary Classification

import tensorflow as tf

# training data set

x_data = [[1,0],
          [2,0],
          [5,1],
          [2,3],
          [3,3],
          [8,1],
          [10,0]]

y_data = [[0],[0],[0],[1],[1],[1],[1]]

# placeholder

X = tf.placeholder(shape=[None,2], dtype=tf.float32) # any number of rows, 2 columns

Y = tf.placeholder(shape=[None,1], dtype=tf.float32) # any number of rows, 1 column

# Weight & bias

W = tf.Variable(tf.random_normal([2,1]), name='weight') # 2 weights

b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis

# H = tf.matmul(X,W) + b

logits = tf.matmul(X,W) + b

H = tf.sigmoid(logits) # the straight line is turned into an S-shaped curve

# cost function

# cost = tf.reduce_mean(tf.square(H-Y)) # least squares, no longer used because the hypothesis changed

cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,
                                                              labels=Y))

# create the train node

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# session & initialization

sess = tf.Session()

sess.run(tf.global_variables_initializer())

# training

for step in range(3000):
    _, cost_val = sess.run([train, cost],
                           feed_dict={X:x_data,
                                      Y:y_data})
    if step % 300 == 0:
        print("cost : {}".format(cost_val))

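# training seems to go well, but the falling cost alone does not verify the model

# with linear regression there was no accuracy measure to check, because its output is a continuous value rather than a class

# measuring accuracy

# predict: H > 0.5 counts as 1; the comparison yields a boolean, True or False

predict = tf.cast(H > 0.5, dtype=tf.float32) # convert the prediction to 0 or 1

correct = tf.equal(predict, Y) # compare predictions with the labels: [True, False ...]

# cast the result to float: True -> 1, False -> 0, then take the mean

accuracy = tf.reduce_mean(tf.cast(correct, dtype=tf.float32))

# run the accuracy node to get the value

print("accuracy : {}".format(sess.run(accuracy, feed_dict={X:x_data,Y:y_data})))

# prediction

print(sess.run(H, feed_dict={X:[[5,2]]})) # 5 hours of study, 2 years living abroad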
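The accuracy calculation above follows three steps: threshold, compare, average. A minimal plain-Python sketch of the same steps is shown here. The probabilities are hypothetical model outputs invented for the example, not results from the training run.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = [1.0 if p == y else 0.0 for p, y in zip(predictions, labels)]
    return sum(correct) / len(correct)


probs = [0.1, 0.4, 0.6, 0.9]   # hypothetical sigmoid outputs
labels = [0, 0, 0, 1]

# predictions are 0, 0, 1, 1, so three of the four match the labels
print(accuracy(probs, labels))  # 0.75
```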

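Linear regression finds a line

A logistic regression model decides by which side of a line a point falls on: one of two classes

Multinomial classification: one of several classes

-> a line separating A from not A

-> a line separating B from not B

-> a line separating C from not C

=> 3 lines are found

They are found together with a single matrix multiplication

one hot encoding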
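One-hot encoding turns a class label into a vector with a single 1 at the position of that class. A minimal sketch in plain Python, using a hypothetical helper named `one_hot` and the grade labels A, B, C:

```python
def one_hot(label, classes):
    """Return a list with 1 at the position of label and 0 elsewhere."""
    return [1 if c == label else 0 for c in classes]


classes = ['A', 'B', 'C']

print(one_hot('A', classes))  # [1, 0, 0]
print(one_hot('B', classes))  # [0, 1, 0]
print(one_hot('C', classes))  # [0, 0, 1]
```

Each position in the vector corresponds to one of the "is it this class or not" lines described above.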

# multinomial classification

import tensorflow as tf

# training data set

# NOTE: as published, x_data has 7 rows but y_data has 6. The row counts must
# match before the data can be fed to sess.run. The original does not show
# which row or label is missing, so the values are left as they appeared.
x_data = [[10,7,8,5],
          [8,8,9,4],
          [7,8,2,3],
          [6,3,9,3],
          [7,5,7,4],
          [3,5,6,2],
          [2,4,3,1]]

# labels before encoding:
# y_data = [[A],
#           [A],
#           [B],
#           [C],
#           [C]]

# one-hot encoded labels
y_data = [[1,0,0], # A
          [1,0,0],
          [0,1,0], # B
          [0,1,0],
          [0,0,1], # C
          [0,0,1]]

# placeholder

X = tf.placeholder(shape=[None, 4], dtype=tf.float32) # shape of the x data

Y = tf.placeholder(shape=[None, 3], dtype=tf.float32)

# Weight & bias

W = tf.Variable(tf.random_normal([4,3]), name='weight') # 12 weights to learn

b = tf.Variable(tf.random_normal([3]), name='bias') # one bias per logit: A, B, C

# Hypothesis

logits = tf.matmul(X,W) + b

#H = tf.sigmoid(logits) # sigmoid would give a separate probability for each of A, B, C

H = tf.nn.softmax(logits) # softmax gives probabilities for A, B, C that sum to 1

# cost function

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits,
                                                                 labels=Y))

# create the train node

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# session & initialization

sess = tf.Session()

sess.run(tf.global_variables_initializer())

# training

for step in range(3000):
    _, cost_val = sess.run([train, cost],
                           feed_dict={X:x_data,
                                      Y:y_data})
    if step % 300 == 0:
        print("cost : {}".format(cost_val))

# measuring accuracy

# logistic: H > 0.5 counts as 1

# multinomial: H holds one probability per class, e.g. 0.3, 0.6, 0.1 at index 0, 1, 2

# the one-hot label, e.g. 1, 0, 0, uses the same index positions

# => compare the index of the largest value in each

# => here 0.6 is at index 1 but the label's 1 is at index 0: the positions differ, so the prediction is wrong

predict = tf.argmax(H, 1) # index of the largest value; axis 1 searches across the columns of each row (axis 0 would search down each column)

correct = tf.equal(predict, tf.argmax(Y,1))

accuracy = tf.reduce_mean(tf.cast(correct, dtype=tf.float32)) # mean of the correct flags

# output

print("accuracy : {}".format(sess.run(accuracy,
                                      feed_dict={X:x_data,
                                                 Y:y_data})))

# prediction

print("prediction: {}".format(sess.run(H, feed_dict={X:[[9,9,7,5]]})))
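What `tf.nn.softmax` and `tf.argmax` compute for a single row can be sketched in plain Python. This illustrates the formulas rather than TensorFlow's code. The helper names and the example logits are assumptions made for this sketch.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


probs = softmax([1.0, 2.0, 0.5])   # example logits for classes A, B, C

print(sum(probs))          # 1.0, up to rounding
print(argmax(probs))       # 1, so class B is predicted
print(argmax([1, 0, 0]))   # 0, so the one-hot label is class A

# the prediction is correct only when both indexes agree
print(argmax(probs) == argmax([1, 0, 0]))  # False
```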

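# the evaluation above reuses the training data as the evaluation data,

# so an accuracy of 100% is misleading

# training data -> used only for training

# the evaluation data must be split off separately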
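A minimal sketch of such a split in plain Python follows. The helper name `train_test_split`, the 30% test share, and the fixed seed are assumptions made for this illustration. scikit-learn provides a function with the same name, but this is not that function.

```python
import random


def train_test_split(rows, test_percent=30, seed=0):
    """Shuffle a copy of rows and split it into (train, test) parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)        # fixed seed keeps the split repeatable
    n_test = len(shuffled) * test_percent // 100
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(10))   # stand-in for 10 labeled examples
train_rows, test_rows = train_test_split(samples)

print(len(train_rows), len(test_rows))  # 7 3
```

Accuracy is then measured only on `test_rows`, which the model never saw during training.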

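# an image is 3-D data (width x height x color) => flattened to 1-D

# the X input is 4-D (many 3-D images) => becomes 2-D after flattening

# MNIST: handwritten digits, the kind of data used by post-office ZIP code sorting machines

# each input arrives with its label attached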

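pip install matplotlib

# getting the data

Training data

train-images-idx3-ubyte.gz: X data (images)

train-labels-idx1-ubyte.gz: Y data (labels)

Test (evaluation) data, used to measure accuracy

t10k-images-idx3-ubyte.gz

t10k-labels-idx1-ubyte.gz

# mnist and plt come from the imports and the read_data_sets call in the full listing

print("label : {}".format(mnist.train.labels[0])) # y label of the first training example

plt.imshow(mnist.train.images[0].reshape(28,28),
           cmap="Greys") # show the training image: the 1-D pixel data is reshaped to 28 x 28 and drawn in grayscale

plt.show()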

# Multinomial Classification

# MNIST

import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data

import matplotlib.pyplot as plt # plotting library

# data loading, with the y labels one-hot encoded

mnist = input_data.read_data_sets('./data/mnist/', one_hot=True)

#print("label : {}".format(mnist.train.labels[0])) # y label of the first training example

#plt.imshow(mnist.train.images[0].reshape(28,28),
#           cmap="Greys") # show the training image, reshaped from 1-D to 28 x 28, in grayscale

#plt.show()

# Placeholder: 28 x 28 = 784 pixels; digits 0-9, so 10 logits

X = tf.placeholder(shape=[None,784], dtype=tf.float32)

Y = tf.placeholder(shape=[None,10], dtype=tf.float32)

# Weight & bias

W = tf.Variable(tf.random_normal([784,10]), name='weight') # 7840 weights to learn

b = tf.Variable(tf.random_normal([10]), name='bias')

# Hypothesis

logits = tf.matmul(X,W) + b

H = tf.nn.softmax(logits)

# cost function

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits,
                                                                 labels=Y))

# create the train node

train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

# session & initialization

sess = tf.Session()

sess.run(tf.global_variables_initializer())

# training

#for step in range(3000):
#    sess.run([train,cost], feed_dict={X:mnist.train.images})

num_of_epoch = 20 # one epoch = one pass of training over the whole data set

batch_size = 100 # how many images to fetch at a time, i.e. how much data is loaded into memory at once

for step in range(num_of_epoch): # 20 epochs
    # iterations per epoch = total number of images / batch size
    num_of_iter = int(mnist.train.num_examples / batch_size)
    cost_val = 0
    for i in range(num_of_iter): # 550 iterations when mnist.train holds 55,000 images
        batch_x, batch_y = mnist.train.next_batch(batch_size) # fetch 100 at a time
        _, cost_val = sess.run([train,cost],
                               feed_dict={X:batch_x,
                                          Y:batch_y})
    if step % 3 == 0:
        print("cost : {}".format(cost_val))

# measuring accuracy

predict = tf.argmax(H, 1) # index of the largest value

correct = tf.equal(predict, tf.argmax(Y,1))

accuracy = tf.reduce_mean(tf.cast(correct, dtype=tf.float32))

# accuracy is measured on the test data

print("accuracy : {}".format(sess.run(accuracy,
                                      feed_dict={X:mnist.test.images,
                                                 Y:mnist.test.labels})))
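The epoch and batch arithmetic can be sketched in plain Python. The figure of 55,000 training examples is an assumption based on the default split of `read_data_sets`, and `iterate_batches` is a hypothetical helper. The real `next_batch` also shuffles the data, which this sketch leaves out.

```python
def iterate_batches(num_examples, batch_size):
    """Yield (start, end) index pairs that cover the data set once."""
    for start in range(0, num_examples, batch_size):
        yield start, min(start + batch_size, num_examples)


num_examples = 55000   # assumed size of mnist.train
batch_size = 100

batches = list(iterate_batches(num_examples, batch_size))

print(len(batches))             # 550 iterations per epoch
print(batches[0], batches[-1])  # (0, 100) (54900, 55000)
```

One epoch is one full pass over these 550 batches, and 20 epochs repeat that pass 20 times.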

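# prediction
# draw digits by hand, scan them into an image,
# write a program that extracts the pixel data, and run the prediction on that data

# machine learning

# linear regression => hard to apply here; accuracy is hard to verify
# logistic regression => cannot solve the XOR problem
# multinomial => several logistic regressions put together

# problems that logistic regression cannot solve
# are approached the way the human brain is thought to work:
# with a neural network (1990s)
# the output of one logistic unit is fed as the input of the next, repeatedly, forming layers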
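The layering idea can be illustrated with XOR, which a single logistic unit cannot separate but two layers can. In this sketch the weights are chosen by hand (an assumption for illustration, not learned values) so that the hidden units act like OR and NAND and the output unit acts like AND.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def xor_network(x1, x2):
    """Two logistic units feeding a third one; weights are hand-picked."""
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)     # behaves like OR
    h2 = sigmoid(-20 * x1 - 20 * x2 + 30)    # behaves like NAND
    return sigmoid(20 * h1 + 20 * h2 - 30)   # behaves like AND of h1 and h2


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xor_network(x1, x2)))
```

The rounded outputs are 0, 1, 1, 0 for the four inputs, which is the XOR truth table. In a trained network these weights would be found by gradient descent instead of being written in by hand.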

# neural network => deep learning
# how Weight & bias are initialized matters
# W = tf.Variable(tf.random_normal([784,10]), name='weight') # 7840 weights to learn
# b = tf.Variable(tf.random_normal([10]), name='bias')
# going on to a CNN raises handwriting accuracy to about 99%

# GAN
