Pandas DataFrame by leoumesh

<div class="text-justify">

![image.png](https://files.peakd.com/file/peakd-hive/leoumesh/23tvgY6dGorMvqefy5UBPxo45YUFX5U6qw74e2uktdJUHf29T3foUNkuPdcCjFJhnM815.png)
<center>[Image Source](https://www.linkedin.com/pulse/pandas-dataframe-functions-madhavan-vivekanandan)</center>

In the last [post](https://hive.blog/hive-196387/@leoumesh/intro-to-pandas-library-python), I gave a brief introduction to the pandas library and one of its main data structures, the Series. In this post, I am going to briefly talk about another data structure, the DataFrame. A DataFrame stores data in two-dimensional, or in other words tabular, form with rows and columns. Each row holds one record, while each column holds one labeled attribute of those records. A DataFrame can also be seen as a collection of Series that share the same index, building on what I discussed in my previous post. There are many things you can do with pandas DataFrames: manipulating the data (indexing, merging, sorting), reshaping it (modifying, adding, or deleting rows and columns), cleaning and preparing it (for example by filling null or NaN values), and so on.
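
To make the "collection of Series" idea concrete, here is a minimal sketch. The variable names and the three sample values are only illustrative assumptions, not part of the weather example used later in this post.

```python
import pandas as pd

# Two Series that share the same index labels (illustrative sample values).
cities = ["New York", "Los Angeles", "Chicago"]
temperature = pd.Series([30, 60, 25], index=cities)
humidity = pd.Series([80, 65, 78], index=cities)

# Each dictionary key becomes a column label; each Series becomes a column.
weather = pd.DataFrame({"Temperature": temperature, "Humidity": humidity})

# Selecting a single column gives back a Series.
column = weather["Temperature"]
print(type(column).__name__)  # Series
print(weather.shape)          # (3, 2)
```

Because both Series use the same index, pandas lines the values up row by row when it builds the table.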

DataFrames make it easier to use data for visualization and analysis. One of the best things about pandas is that it can read and write many common file formats, such as JSON, plain text, and CSV. Here we will do some coding with DataFrames. The syntax for creating a DataFrame is quite similar to that of a Series. We will create a weather DataFrame with five rows, each holding the date, city, temperature, humidity, and precipitation for a particular US city.
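
As a quick taste of the file format support, the sketch below loads a DataFrame from CSV text and from JSON text. To keep it self-contained, `io.StringIO` stands in for real files, and the small sample strings are made up for illustration.

```python
import io
import pandas as pd

# CSV text standing in for a file on disk (made-up sample rows).
csv_text = """Date,City,Temperature
2024-01-01,New York,30
2024-01-02,Los Angeles,60
"""
from_csv = pd.read_csv(io.StringIO(csv_text))

# JSON text standing in for a file: a list of records, one object per row.
json_text = '[{"City": "Chicago", "Temperature": 25}, {"City": "Houston", "Temperature": 50}]'
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

With real files you would pass a file path instead of the `io.StringIO` object.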

```
import pandas as pd

# Row labels (the index) and column labels for the table
row_labels = [0, 1, 2, 3, 4]
column_labels = ['Date', 'City', 'Temperature', 'Humidity', 'Precipitation (in mm)']

# One inner list per row, in the same order as column_labels
data = [
    ['2024-01-01', 'New York', 30, 80, 0.2],
    ['2024-01-02', 'Los Angeles', 60, 65, 0.0],
    ['2024-01-03', 'Chicago', 25, 78, 0.1],
    ['2024-01-04', 'Houston', 50, 60, 0.4],
    ['2024-01-05', 'Phoenix', 45, 55, 0.3],
]

df = pd.DataFrame(data=data, index=row_labels, columns=column_labels)

# In a notebook, a bare variable name on the last line displays the table
df
```
As you can see above, we first imported the pandas library and then defined the labels for the rows and columns. Then we filled in the data: five rows, one per US city, each with a value for all five columns. Finally, we used the ***DataFrame*** constructor to create the DataFrame, passing it the index (the row labels), the columns, and the data. There is also an optional argument you can pass here, the ***data type (dtype)***. Now let's see the output in tabular format.
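
Here is a small sketch of what the optional `dtype` argument does. It uses a numeric-only subset of the weather values as an assumption, because a single `dtype` is applied to every column and would not suit the text columns.

```python
import pandas as pd

# Numeric-only sample: temperature and humidity for three cities.
readings = [[30, 80], [60, 65], [25, 78]]
columns = ["Temperature", "Humidity"]

# Without dtype, pandas infers an integer type from the values.
default_df = pd.DataFrame(data=readings, columns=columns)

# With dtype, every column is stored as the requested type.
float_df = pd.DataFrame(data=readings, columns=columns, dtype="float64")

print(default_df.dtypes)
print(float_df.dtypes)
```

If you leave `dtype` out, as in the weather example, pandas picks a suitable type for each column on its own.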


![image.png](https://files.peakd.com/file/peakd-hive/leoumesh/23xA22yY3CfUffLoUYeW3wSZCptGtgWrg4i6pUdG8SksK96JnsP1Le6j7am6ey4D2RYkD.png)

You can pass the data in other formats too. I passed it as a nested list, but a tuple of tuples works just as well and returns the same result. A set is not a good fit, because sets are unordered and cannot hold lists. Notice that the index currently runs from 0 to 4. I want the city to be the index. We can do that with:
```
df.set_index("City")
```
If you run the above code, you will get the following output:

![image.png](https://files.peakd.com/file/peakd-hive/leoumesh/23t74pDhgCaPBrCSYHQ7AK9GLFivwu64ie8aHxUDkibgqqkCuRWWCY4wb5pAHyysaCwN5.png)

Note that calling `set_index` like this does not change the original DataFrame; it returns a new DataFrame with City as the index. If you print your DataFrame after doing this, you will still get the original one, as below:

![image.png](https://files.peakd.com/file/peakd-hive/leoumesh/23t732e3F9fabxETC4p3dMyoQeS4RQ9bwdEsMXnHjdbSoJEEviVCR4DpKkX3NEQXPyinT.png)
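
One way to keep the result is to capture the returned DataFrame in a variable. The sketch below shows this on a two-row stand-in for the weather table; the name `indexed` is just an illustrative choice.

```python
import pandas as pd

# A two-row stand-in for the weather table.
df = pd.DataFrame({
    "City": ["New York", "Chicago"],
    "Temperature": [30, 25],
})

# set_index returns a new DataFrame; df itself is left untouched.
indexed = df.set_index("City")

print(df.index.tolist())       # [0, 1]
print(indexed.index.tolist())  # ['New York', 'Chicago']
```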

To apply the change to the original DataFrame itself, there is an optional argument that you can pass to the **set_index()** method, as below:

```
df.set_index("City", inplace=True)

df
```
Now you can see the desired output below:

![image.png](https://files.peakd.com/file/peakd-hive/leoumesh/23sweHpQsMV4qpay3N2gVDyVnfZeSV9TbTQhReuVXYF2W9YY6yLvjSsDBd6rNQ9Bu51fe.png)

By default, **inplace** is set to False. There is also another optional argument called **drop**, which is set to True by default. When you set it to False, City stays as a regular column and is used as the index as well. You can reset the index using the `reset_index` method as below:

```
df.reset_index(drop=False)
```
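The default for `drop` in `reset_index` is already False, so writing it out only makes the intent explicit: City moves back into the table as the first column. If you pass `drop=True` instead, City is discarded and we won't get back our original DataFrame. Note that `reset_index` also returns a new DataFrame unless you pass `inplace=True`. The output of the above code is: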


![image.png](https://files.peakd.com/file/peakd-hive/leoumesh/23t72QtMNBoB8Jyfk9HTnWoa9o2iN7g9pWDVe3wxFnM1ToBzsCJwXapnggMUEThxwJAj3.png)
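
To compare the `drop` options of `set_index` and `reset_index` side by side, here is a minimal sketch on a two-row stand-in for the weather table. The variable names are illustrative assumptions.

```python
import pandas as pd

# A two-row stand-in for the weather table.
df = pd.DataFrame({
    "City": ["New York", "Chicago"],
    "Temperature": [30, 25],
})

# set_index with drop=False keeps City as a column and also uses it as the index.
kept = df.set_index("City", drop=False)
print(kept.columns.tolist())       # ['City', 'Temperature']

# The default set_index moves City out of the columns and into the index.
indexed = df.set_index("City")

# reset_index with its default (drop=False) brings City back as a column.
restored = indexed.reset_index()
print(restored.columns.tolist())   # ['City', 'Temperature']

# reset_index with drop=True throws the City labels away.
discarded = indexed.reset_index(drop=True)
print(discarded.columns.tolist())  # ['Temperature']
```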

So that's all for now on the basics of the pandas DataFrame. In the next post, we will talk about reading CSV files using pandas, and later posts will cover manipulating data with the pandas library.

</div>
Posted by @leoumesh in hive-196387 on 2024-07-19 04:12:36. Tags: stem, python, code, coding, computerscience, tutorial, programming, pandas.
@dobro2020 ·
It is a good example of how to do tables with Python, but SQL is better :D
@leoumesh ·
Both of them excel at their own purpose. SQL is better used for extracting and filtering while pandas is better used for manipulation.