My Blog for coding and note

https://github.com/hesthers

Practice questions

Q:
Write Python code that generates the Fibonacci sequence using the NumPy module.

I have already posted my answer code (my solution!) on my Naver blog.
Please refer to that post after you have tried the problem yourself.

You can also define a function in Python to solve it.

Solution: Blog Link
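
For readers who want something to compare against, here is one possible sketch. It is not the solution from the linked post; the function name fibonacci and the choice to start the sequence at 0 are my own assumptions.

```python
import numpy as np

def fibonacci(n):
    """Return the first n Fibonacci numbers as a NumPy integer array."""
    seq = np.zeros(n, dtype=np.int64)   # start with n zeros
    if n > 1:
        seq[1] = 1                      # second term is 1
    for i in range(2, n):
        # each term is the sum of the two terms before it
        seq[i] = seq[i - 1] + seq[i - 2]
    return seq

first_ten = fibonacci(10)
print(first_ten)
```

The array is filled in place, so the loop only ever reads the two previous entries.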

Numpy in Python

What is Numpy?

  • A package for numeric calculation, linear algebra, and scientific computing
  • Open source, so anyone can install and use the NumPy package (module) for free
  • Broadcasting: arrays with different shapes (for example, a matrix and a vector) can be used together in one calculation
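
A small sketch of broadcasting may help here. The array names and values are made up for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])    # shape (2, 3)
row = np.array([10, 20, 30])      # shape (3,)

# The row is stretched across both rows of the matrix.
shifted = matrix + row
# A single number is broadcast to every element.
doubled = matrix * 2

print(shifted)
print(doubled)
```

No loop is needed: NumPy lines the shapes up from the right and repeats the smaller array where a dimension is missing.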

Why is Numpy used?

  • NumPy is one of the most powerful calculation tools in Python
  • Functions in Numpy module are very useful!

Some functions of Numpy

  1. arithmetic functions
  • prod() or cumprod(): product or cumulative product
  • sum() or cumsum(): sum or cumulative sum
  • abs(): absolute value
  • square() or sqrt(): square or square root
  • exp() (= exponential) or log() (natural logarithm)
  2. statistical functions
  • mean() / std() / var() / max() or min()
  3. logical functions
  • isnan() / unique()
  4. geometric (shape) functions
  • shape()
  • transpose()
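
The functions listed above can be tried on a small array. This is only a sketch with made-up data; the variable names are mine.

```python
import numpy as np

data = np.array([1.0, 4.0, 9.0, 16.0])

total = np.sum(data)           # 30.0
running = np.cumsum(data)      # 1, 5, 14, 30
product = np.prod(data)        # 576.0
roots = np.sqrt(data)          # 1, 2, 3, 4
average = np.mean(data)        # 7.5

with_gap = np.array([1.0, np.nan, 3.0])
missing = np.isnan(with_gap)   # True only where the value is NaN
distinct = np.unique(np.array([3, 1, 3, 2, 1]))   # sorted unique values

table = np.array([[1, 2, 3], [4, 5, 6]])
dims = np.shape(table)         # (2, 3)
flipped = np.transpose(table)  # rows and columns swapped
```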

Array in Numpy

  • The array is the basic data type for NumPy calculations.
  • Calculations on arrays are relatively fast compared with plain Python lists.

Indexing and Slicing in array of Numpy

  • Indexing and slicing work the same way as for the list data type.
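
A quick sketch comparing a list and an array, with example values of my own. The last two lines show the one extra thing arrays allow: a comma-separated index for two-dimensional data.

```python
import numpy as np

values = [10, 20, 30, 40, 50]
arr = np.array(values)

first = arr[0]          # same as values[0]
last = arr[-1]          # same as values[-1]
middle = arr[1:4]       # same elements as values[1:4]

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
cell = grid[0, 1]       # row 0, column 1
column = grid[:, 0]     # every row, column 0
```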

Special shape of array

  • np.ones(): an array filled with 1
  • np.zeros(): an array filled with 0
  • np.eye(): the identity matrix
    ones() and zeros() in particular are used frequently in data analysis.
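
A short sketch of these special arrays. The shapes and the sample matrix are arbitrary choices for illustration.

```python
import numpy as np

ones = np.ones((2, 3))     # 2 rows, 3 columns, all 1.0
zeros = np.zeros(4)        # four 0.0 values
identity = np.eye(3)       # 3x3 identity matrix

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
# Multiplying by the identity matrix leaves m unchanged.
same = m @ identity
```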

Practice Questions

Lottery example:

  • Pick (print) 6 random numbers between 1 and 45 without duplicates.

  • python code:

import random

lotto = []
# Keep drawing until there are 6 different numbers from 1 to 45.
while len(lotto) < 6:
    number = random.randint(1, 45)
    if number not in lotto:  # skip a number that was already picked
        lotto.append(number)
print(lotto)

Similar question:

  • Make 5 sets of lottery numbers.

import random

total_lotto = []
for _ in range(5):
    lotto = []
    # Draw until this set has 6 different numbers from 1 to 45.
    while len(lotto) < 6:
        number = random.randint(1, 45)
        if number not in lotto:  # skip a number that was already picked
            lotto.append(number)
    total_lotto.append(lotto)
print(total_lotto)
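
As an alternative sketch, the standard library's random.sample picks items without repetition in a single call, so no duplicate check is needed. The helper name draw_set and the sorting step are my own additions.

```python
import random

def draw_set():
    """Return 6 different numbers from 1 to 45, sorted for readability."""
    return sorted(random.sample(range(1, 46), 6))

five_sets = [draw_set() for _ in range(5)]
for numbers in five_sets:
    print(numbers)
```

Note that range(1, 46) is used because the end of a range is not included.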

Conditional statement and iteration statement

Python conditional statement

  • As the name suggests, a conditional statement runs code only when a condition is met, which is useful for handling different cases in data analysis with Python.

  • The reserved words for conditional statements are if, elif, and else.

  • if: runs when the first condition is true.

  • elif: runs when the earlier conditions are false but this condition is true.
  • else: runs when all of the conditions above are false.

  • Format with example:

    a = 15  # example value
    if a > 15:
        print('High')
    elif a == 15:
        print('Yes!')
    else:
        print('Low')
  • Conditional statements are useful in all kinds of code. (You’ll use them a lot!!)

Python iteration statement

  1. while:
  • repeats as long as the condition is true, or until the reserved word break is reached
  • format:
    while [condition]:    # write True here to repeat until break
        statement..
        ~
  • if you want to stop the while statement because the goal is already met, use the reserved word break.
  • if you want to skip the rest of the current repetition for some specific condition and move on to the next repetition, use the reserved word continue.
  2. for:
  • repeats a defined number of times
  • use the range function with a for loop to set the number of repetitions automatically
  • format:
    for i in [list, tuple ...] (or range(#, #+1)):
        statement..
        ~
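
The two loop formats can be sketched like this. The numbers and variable names are only examples.

```python
# while with break and continue: collect the odd numbers below 10
collected = []
n = 0
while True:
    n += 1
    if n % 2 == 0:
        continue        # skip even numbers and start the next repetition
    if n > 9:
        break           # stop the loop completely
    collected.append(n)

# for with range: repeat a fixed number of times
squares = []
for i in range(1, 6):   # 1, 2, 3, 4, 5 (the end value 6 is not included)
    squares.append(i * i)

print(collected)
print(squares)
```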

Data analysis with Python

Why Python for data analysis?

  • what value the analyst extracts from the data
  • what the analyst wants to get from the data analysis
  • whether the analyst can make the right decision

The process of data analysis

  1. data collection (using open source/crawling …)
  2. data exploration (EDA methods with python)
  3. data preprocessing (using raw data or feature engineering)
  4. data modeling and feedback

The tools for data analysis

  • The most important thing: use the proper tool for the purpose of the data analysis
  • You need to know why you are analyzing the data and which functions are required

Python

  • A programming language (a specific language for giving commands to a computer) that is interpreted, meaning the code is executed line by line
  • For communication between a human and a computer
  • Readability: code is easy to recognize and read
  • Open source Python code: libraries and modules (e.g. numpy, pandas …)
  • Automatic management of the computer's memory
  • Variables in Python code = spaces that store data

Python

reference python link

Definition of Python

  • It is used for various purposes: data analysis, software engineering
  • interpreted language (compatible computing language)
  • easier to learn than other languages

Types of variables

  1. int(integer)
  2. float
  3. bool(True/False)
  4. str(string-characters)

Operators

  • Ordinary operations (calculations) can be applied not only to numbers but also to strings.
  • example:
    print('Hello!' * 3)
    # The result: Hello!Hello!Hello!
  • Logical operators and comparison operators are also available in Python.
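
A brief sketch of comparison and logical operators, using two example numbers of my own choosing.

```python
a, b = 7, 3

# Comparison operators return a bool.
greater = a > b                 # True
equal = a == b                  # False
not_equal = a != b              # True

# Logical operators combine bools.
both = (a > 5) and (b > 5)      # False, because b is 3
either = (a > 5) or (b > 5)     # True
negated = not equal             # True

# + joins strings and * repeats them.
banner = 'ab' * 2 + '!'         # 'abab!'
```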

Data structures: 4 types

  1. List
  • format: [] (items separated by commas)
  • indexing and slicing (This is the most important thing!!! in data analysis): python indexing
  • .append(): adds data to the end of the list
  • .sort() or .sort(reverse=True): sorts the data in ascending or descending order
  • .reverse(): reverses the order of the data in the list
  2. Dictionary
  • format: {keys: values}
  • values in a dictionary can be lists, sets, strings, or int/float
  • look up a value by indexing with its key
  • .update() or assignment to a key: adds or changes data (a dictionary has no .append())
  3. Tuple
  • format: () (items separated by commas)
  • The data is fixed, so changing the data is impossible.
  • Joining tuples with + is possible, but it creates a new tuple.
  4. Set
  • format: {} (similar to a dictionary but without keys and values); an empty set is written set(), because {} alone is an empty dictionary
  • removes duplicates (keeps unique data): if you add the same data many times, it is stored only once.
  • operations: intersection, union, difference
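
The four structures can be tried side by side. This is a small sketch with made-up names and values.

```python
# List: ordered and changeable
scores = [70, 95, 80]
scores.append(60)              # add to the end
scores.sort(reverse=True)      # descending order
top_two = scores[:2]           # slicing
scores.reverse()               # now ascending again

# Dictionary: key -> value
ages = {"kim": 30}
ages["lee"] = 25               # add a new key
ages.update({"kim": 31})       # change an existing value

# Tuple: fixed, but + builds a new tuple
point = (1, 2)
longer = point + (3,)

# Set: unique items only
tags = {"a", "b"}
tags.add("a")                  # already present, so nothing changes
common = tags & {"b", "c"}     # intersection
combined = tags | {"c"}        # union
only_mine = tags - {"b"}       # difference
```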

MongoDB

  • What is MongoDB?
    It is a kind of NoSQL database, with no relations between tables.
    MongoDB queries are written in JavaScript.

  • Terminology on NoSQL

  1. tables = collections
  2. records = documents
  3. create databases = use [database name]
  4. a query in MySQL begins with SELECT, but a query in NoSQL is db.[collection name].find()
  5. if there are conditions on the data, add the conditions inside ().
  6. operators begin with $.
  • comparison operators:
    $lt = < (less than), $lte = <= (less than or equal)
    $gt = > (greater than), $gte = >= (greater than or equal)
    $eq = = (is equal)
  • logical operators: $or, $and, $not, $nor
  7. function: begins with
    var [variable name] = function(~){
    formula
    return ~
    }
  8. ordering = sort: 1 = asc, -1 = desc, .sort({column name: 1 or -1})
  9. skip(#).limit(#) = limit #, # in MySQL
  • please check the extra examples and the formula in depth on my Naver blog and Github blog!

List of my Github blog

I added a link to the list of my GitHub blog posts on this page!

Please check there if you cannot find a post.

My Github Blog link

SQL REPLICATION

I learned about SQL replication today, so I’m posting about it on my blog.

  • definition of Replication:
    Think of making a copy. Replication in SQL means keeping a copy of a database on another server.

  • the process of Replication:

First, create a master server and a slave server on the AWS website. You can also create several slave servers.

Second, using the public IP addresses of the master and slave servers, connect to each server from git bash.
Use the command ssh -i ~/.ssh/[key pair name].pem ubuntu@[public IP address]. (Ubuntu must be installed on the server beforehand.)

Third, on both the master and slave servers, install MySQL and connect MySQL to the server. Create a new MySQL account from git bash.

Fourth, in the MySQL session connected to the master server on git bash, create a database.

Fifth, go back to the MySQL program and create new connections to the master and slave servers.

Sixth, on the master DB connection in MySQL, run the queries that link it to the slave DB connection.

Seventh, insert some data into the master DB.

Eighth, check on the slave DB connection whether the replication succeeded.

If you see Yes in the slave status, you have succeeded!!!!! You’re done!

SQL Backup

  • You can back up files or databases in MySQL using the server.

  • If you delete some data by mistake, you can restore it, but only if you backed it up beforehand.

  • types:

  1. depending on the environment of the backup
  • hot backup: taken while the DB is running
  • cold backup: taken while the DB is stopped
  2. depending on the way to back up
  • logical backup: stores the data converted into SQL queries; errors are easy to check; use it when the amount of data is small
  • physical backup: copies the data files themselves, so the files are relatively large but backup and restoration are fast; use it when there is a large amount of data that must be backed up as quickly as possible
  • hot logical backup:
  1. using crontab and a shell script, check the logs and times for the backup

  2. with cyberduck, upload the files or databases to back up

  • cold physical backup:
  1. you need authorized access before backing the data up

  2. you have to log out of the ubuntu account and then log in again

  3. you have to make the AWS server for the backup first

  • If backing the data up on the virtual server feels difficult, you can back the data up in the SQL program instead. (Export data)

SQL Join

  • A join connects a table with one or more other tables through specific columns that they share.

  • types:

  1. inner join
  2. left join
  3. outer join: left/right join & union
  4. right join
  • to join tables, write “(inner)/left/right join ~ on” in the query.

  • union: combines the results of two queries and removes the duplicated rows

  • to join three tables, you can write the names of the three tables after the reserved word “from”.
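
To try joins without setting up a MySQL server, here is a sketch that uses Python's built-in sqlite3 module with an in-memory database. SQLite stands in for MySQL here, and the tables users and orders are hypothetical examples, not tables from this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE orders (user_id INTEGER, item TEXT);
    INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cho');
    INSERT INTO orders VALUES (1, 'pen'), (1, 'ink'), (2, 'cup');
""")

# inner join: only users that have a matching order
inner = conn.execute("""
    SELECT u.name, o.item
    FROM users AS u
    JOIN orders AS o ON u.id = o.user_id
    ORDER BY u.id, o.item
""").fetchall()

# left join: every user, with None where no order matches
left = conn.execute("""
    SELECT u.name, o.item
    FROM users AS u
    LEFT JOIN orders AS o ON u.id = o.user_id
    ORDER BY u.id, o.item
""").fetchall()

# union: both id lists combined, duplicates removed
ids = conn.execute("""
    SELECT user_id FROM orders
    UNION
    SELECT id FROM users
    ORDER BY 1
""").fetchall()
conn.close()

print(inner)
print(left)
print(ids)
```

The user with no orders appears only in the left join result, which is the practical difference between the two join types.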

SQL constraints

  • when making a table in SQL, you can put constraints on each column.
  1. unique: the same value cannot be saved twice in the column. Each value is the only one of its kind, such as an SSN.
  2. primary key: this key has the attributes of both Not Null and Unique.
  3. foreign key: when a table is connected to another table, the column can only hold values that exist in the referenced table. Breaking the foreign key constraint would also break the integrity of the data.
  • Extra constraints
  1. not null: a row cannot be saved without a value in the column.
  2. default: the value a column gets when none is given, that is, its initial value (an initial value and a primary key are different things!)
  • with constraints in place, tables can be used and analyzed more reliably.
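
The constraints above can be watched in action with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for MySQL, the tables teams and members and the helper rejected are hypothetical, and SQLite only enforces foreign keys after the PRAGMA shown below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite needs this to enforce foreign keys
conn.executescript("""
    CREATE TABLE teams (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE members (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        role TEXT DEFAULT 'member',
        team_id INTEGER REFERENCES teams(id)
    );
    INSERT INTO teams (id, name) VALUES (1, 'data');
    INSERT INTO members (name, team_id) VALUES ('Ann', 1);
""")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

duplicate = rejected("INSERT INTO teams (id, name) VALUES (2, 'data')")        # unique
no_value = rejected("INSERT INTO members (name) VALUES (NULL)")                # not null
orphan = rejected("INSERT INTO members (name, team_id) VALUES ('Bob', 99)")    # foreign key

# Ann was inserted without a role, so the default value was used.
default_role = conn.execute(
    "SELECT role FROM members WHERE name = 'Ann'"
).fetchone()[0]
conn.close()
```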

SQL Basics

  • when using SQL, begin a query with select or SELECT and end it with ; (semi-colon).

example:

select name, no
from info
where no >= 13;
  • basic statement for running a query (the SELECT ... FROM ... WHERE ... ORDER BY form is the same in every RDBMS; LIMIT is MySQL syntax that some other systems write differently.)
SELECT columns...
FROM table names
[WHERE condition statement #using the condition statement here! (with arithmetic operator)
ORDER BY columns... #asc (= default) or desc
LIMIT ...];
  • between: in the range (both ends are included)

if the number is at least 15 and at most 24,
write it like the following:

select ~~
from ~~
where no between 15 and 24;
  • in/not in: in = included in the given list, not in = not included in the given list

  • like: when a specific string pattern is included in the data
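
These three conditions can be tested with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for MySQL, the rows are invented, and the column num is a hypothetical stand-in for the no column used in the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE info (name TEXT, num INTEGER);
    INSERT INTO info VALUES
        ('kim', 10), ('lee', 15), ('park', 20), ('kang', 24), ('choi', 30);
""")

def names(condition):
    """Return the names that match a WHERE condition, ordered by num."""
    sql = "SELECT name FROM info WHERE " + condition + " ORDER BY num"
    return [row[0] for row in conn.execute(sql)]

in_range = names("num BETWEEN 15 AND 24")   # 15 and 24 themselves are included
in_list = names("num IN (10, 30)")
not_in_list = names("num NOT IN (10, 30)")
starts_with_k = names("name LIKE 'k%'")     # % matches any run of characters
conn.close()

print(in_range, in_list, not_in_list, starts_with_k)
```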

Practice for function

  • Question: Enter your name and print the name with a greeting, using a function.

def greeting(n):
    # A name made only of letters gets the greeting.
    if n.isalpha():
        return "Dear " + n + ", Hello!"
    # Digits or other characters do not look like a name.
    return 'Is it your name?'

name = input("Please input your name: ")
print(greeting(name))

Python example code

Practice question for Python

Question: A dozen pencils is 12 pencils. Read any number as input and print how many dozens and how many extra pencils that number makes.

pencil = int(input('Enter the number of pencils: '))

print('The pencils you entered make %d dozen' % (pencil // 12), end = ',')
print(' with %d left over.' % (pencil % 12))