I was working on a solution to a Stack Overflow question and felt I could elaborate on the answer just a tiny bit.
In the original question, the questioner asked why they were unable to assign a decimal value (a float) to a cell in a pandas DataFrame. You can see the original question here.
QUESTION:
The questioner was trying to assign the result of the following division to the cell in column 1, row 1. Instead of a float, an integer was stored in that cell.
in: df['column1']['row1'] = 1 / 331616
in: df['column1']['row1']
out: 0
ANSWER:
My answer, with elaboration, follows…
pandas appears to be treating the column's datatype as an integer (int). By default, pandas infers a datatype for each column from the data stored in it and assigns that datatype to the column. This link to the pandas documentation has a summary of the types of data that can be inferred.
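To make the inference concrete, here is a minimal sketch. The three small DataFrames and their values are hypothetical and chosen only for illustration. It simply reads back the dtype pandas picked for each column.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

# Hypothetical example data, used only to show what pandas infers.
ints = pd.DataFrame({'column1': [1, 2, 3]})
floats = pd.DataFrame({'column1': [1.0, 2.5, 3.0]})
mixed = pd.DataFrame({'column1': [1, 2.5, 3]})

int_dtype = ints['column1'].dtype      # an integer dtype
float_dtype = floats['column1'].dtype  # a float dtype
mixed_dtype = mixed['column1'].dtype   # ints mixed with floats are inferred as float

print(int_dtype, float_dtype, mixed_dtype)
print(is_integer_dtype(int_dtype), is_float_dtype(mixed_dtype))
```

The exact integer width printed can depend on the platform and library versions, which is why the sketch checks the kind of dtype rather than a specific name.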
If, when created, a DataFrame has integers stored in a given column, then a float assigned to that column later may not be stored as a float unless you first set or change the datatype.
There are two ways to address this issue: set the datatype (also referred to as a dtype) to float when the DataFrame is constructed, or change (cast) the datatype to float on the fly. Let’s look at both techniques.
Setting the datatype (dtype) during construction:
>>> import pandas as pd
In making this simple DataFrame, we provide a single example value (1) and declare the column as containing floats at creation time:
>>> df = pd.DataFrame([[1]], columns=['column1'], index=['row1'], dtype=float)
>>> # .loc sets the cell directly; chained indexing may write to a temporary copy
>>> df.loc['row1', 'column1'] = 1 / 331616
>>> df
       column1
row1  0.000003
Converting the datatype on the fly:
>>> df = pd.DataFrame([[1]], columns=['column1'], index=['row1'], dtype=int)
>>> df['column1'] = df['column1'].astype(float)
>>> df.loc['row1', 'column1'] = 1 / 331616
>>> df
       column1
row1  0.000003
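As a quick check on the conversion, the sketch below reads the column's dtype before and after the cast and then reads the stored value back. The variable names `before`, `after`, and `stored` are mine, and the assignment uses `.loc` rather than chained indexing, which is an assumption about style rather than part of the original answer.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

# Same one-cell DataFrame as above, created with an integer dtype.
df = pd.DataFrame([[1]], columns=['column1'], index=['row1'], dtype=int)
before = df['column1'].dtype  # integer dtype at creation

# Cast the column to float, then confirm the change took effect.
df['column1'] = df['column1'].astype(float)
after = df['column1'].dtype  # float dtype after the cast

# Assign the division result with .loc and read it back.
df.loc['row1', 'column1'] = 1 / 331616
stored = df.loc['row1', 'column1']

print(before, after, stored)
```

Checking `df.dtypes` (or a single column's `.dtype`) like this is a cheap way to confirm which of the two situations you are in before assigning values.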