Table of Contents
Introduction to Python database
We use databases to store data, large amount of data which is not possible in text files or excel files. In this post, we will store market data in Python database using sqlite3 library, we will store into the database and retrieve from the database. Moreover, we will perform various queries on the database.
Let’s use stock market data as a real-world example. You’ll find it easy to grasp the concept.
Preparation for Python database
First we need our work environment ready, you should have Python installed on your system. If not then, visit here to go through a guide to install it.
Now, after Python is ready, we will install Sqlite3. Write the following command in your terminal
paradisecoder@penguin:~$ pip install db-sqlite3
Great, our work environment is now ready. Let’s begin with storing data.
Store data in Sqlite3 database
We have data in a .csv format. You can view the file by clicking below button, we will store this data in our database and retrieve from it and also query on it. But first, we are going to store our data.
We will create a folder called python_database and make a virtual environment there. This way, our files will be separated and it won’t conflict with your own projects.
We are installing virtualenv library, it will create a virtual python environment for us.
paradisecoder@penguin:~/python_database$ pip install virtualenv
Using this line, we are creating a virtual environment ‘myvenv’.
Note: If python3 is not working, you can replace it with python.
paradisecoder@penguin:~/python_database$ python3 -m venv myvenv
After it is done, we will activate our new virtual environment.
paradisecoder@penguin:~/python_database$ source myvenv/bin/activate
(myvenv) paradisecoder@penguin:~/python_database$
As you can observe, after activating myvenv, a bracket containing the virtual environment name (myvenv) can be seen before our username.
Installing necessary libraries
Type the following commands in the terminal to install necessary libraries.
paradisecoder@penguin:~/python_database$ pip install pandas
paradisecoder@penguin:~/python_database$ pip install db-sqlite3
Starting our project
Preparation is done. Virtual environement – check, Python installed – check, sqlite3 library installed – check. Now we will start with storing our data in a database.
Store data into the database
import sqlite3
import pandas as pd
#connecting to the database/making a new one
con = sqlite3.connect("database_1.db")
#making a cursor
cur = con.cursor()
#creating a variable for SQL command
command = f"create table if not exists symbol (date date primary key, open int, high int, low int, close int, volume int)"
#passing command variable to cursor
cur.execute(command)
#copying csv data to df1 variable in data frame format
df1 = pd.read_csv("market_data.csv")
#filling the symbol table with data frame data
df1.to_sql(symbol, con, if_exists='append', index = False)
#commiting changes
con.commit()
Retreiving data from the database
import sqlite3
import pandas as pd
#connecting with the database
con = sqlite3.connect("database_1.db")
#creating a cursor
cur = con.cursor()
#making SQL command and saving it in a variable
command1 = "select * from symbol"
#for loop for printing all the data in the cursor
for row in cur.execute(command1):
print(row)
#closing the connection
con.close()
Querying on the data
Get data from specific range
import sqlite3
import pandas as pd
#connecting with the database
con = sqlite3.connect("database_1.db")
#creating a cursor
cur = con.cursor()
#making SQL command and saving it in a variable
command1 = "select * from symbol where date between '2024-01-03' AND '2024-01-10'"
#for loop for printing all the data in the cursor
for row in cur.execute(command1):
print(row)
#closing the connection
con.close()
Select specific columns
import sqlite3
import pandas as pd
#connecting with the database
con = sqlite3.connect("database_1.db")
#creating a cursor
cur = con.cursor()
#making SQL command and saving it in a variable
command1 = "select open,high from symbol where date between '2024-01-03' AND '2024-01-10'"
#for loop for printing all the data in the cursor
for row in cur.execute(command1):
print(row)
#closing the connection
con.close()
Make data ascending
import sqlite3
import pandas as pd
#connecting with the database
con = sqlite3.connect("database_1.db")
#creating a cursor
cur = con.cursor()
#making SQL command and saving it in a variable
command1 = "select * from symbol order by high ASC"
#for loop for printing all the data in the cursor
for row in cur.execute(command1):
print(row)
#closing the connection
con.close()
Making data descending
import sqlite3
import pandas as pd
#connecting with the database
con = sqlite3.connect("database_1.db")
#creating a cursor
cur = con.cursor()
#making SQL command and saving it in a variable
command1 = "select * from symbol order by high desc"
#for loop for printing all the data in the cursor
for row in cur.execute(command1):
print(row)
#closing the connection
con.close()
Leave a Reply