In the real world, data is mostly in the format of files i.e. Excel sheets. These excel sheets are not well structured and it contains multiple sheets, which is very complex to handle and hard to visualize (larger data) with using cross sheets with some complex conditions. SQL does this task very easily with schema structure and joins.
SQL is a Structured Query Language, which is the base of all the SQL Database. for e.g. PostgreSQL, MongoDB, Microsoft SQL Server, etc.
SQL is much faster than Excel sheets. It can take hours to do a task in Excel and in SQL it takes minutes to perform the same task.
Every big company uses databases to handle data. So there is a requirement of converting this excel sheet data into SQL Tables.
Steps to convert the Excel data to SQL Table
Check that you have all the required libraries.
# To create the data frame into Table
from sqlalchemy import create_engine
#For manipulating data frame
import pandas as pd
First load the excel sheet.
index_col = 0 => It will set first column as index column.
header =[1,4] => It will set first 4 row as header which is column names.
Format the excel sheet so that columns contain the same datatypes and also handle the missing values.
You have four main options for converting types in pandas:
infer_objects() - a utility method to convert object columns holding Python objects to a pandas type if possible.
convert_dtypes() - convert DataFrame columns to the "best possible" dtype that supports
pd.NA (pandas' object to indicate a missing value).
Handling Missing Values:
fillna => It will fill the null value with the mean of that column.
For string type, we can replace it with string to handle missing or null values.
Create Engine in sqlalchemy
engine = create_engine(r’postgresql://user_name:password@server_ip_address:port_number/Database_name’, echo = True)
create_engine => It will create the engine for postgresql that will connect the server to the postgreSQL .
Type your username, password, server IP address, Port number, and Database Name to connect that specified server.
to_sql => It will convert the Excel sheet into specified table name.
if_exists=>It will replace the Table to handle multiple execution of program.
Replace will drop the table before inserting new values.
To Execute SQL query using the following command:
engine.execute(“Write the SQL Query”)
You can read more about the to_sql from the below link:
Read more about sqlalchemy: