Setting datatypes in pandas DataFrames

I was working on a solution to a Stack Overflow question and felt I could elaborate on the answer just a tiny bit.

In the original question, the questioner asked why they were unable to assign a decimal value (a float) to a cell in a pandas DataFrame. You can see the original question here.

QUESTION:
The question focused on trying to assign the answer to the following division equation to a cell in column 1 and row 1. Instead of getting a float, an integer is stored in that cell.

in: df['column1']['row1'] = 1 / 331616
in: df['column1']['row1']
out: 0

ANSWER:
My answer, with elaboration, follows…

pandas appears to be presuming that the datatype is an integer (int). This is because by default, pandas attempts to infer and assign a datatype to a column based on the data stored in the column. This link to the pandas documentation has a summary of the types of data that can be inferred.

If, when created, a DataFrame has only integers stored in a given column, pandas gives that column an integer dtype. As the question shows, a float stored in that column later may then be truncated to an integer unless you set or change the datatype.

There are two common ways to address this issue: set the datatype to float when the DataFrame is constructed, or change (cast) the datatype (also referred to as a dtype) to float on the fly. Let’s look at both techniques.

Setting the datatype (dtype) during construction:

>>> import pandas as pd

In making this simple DataFrame, we provide a single example value (1) and use dtype=float to define the columns as containing floats at creation:

>>> df = pd.DataFrame([[1]], columns=['column1'], index=['row1'], dtype=float)
>>> df['column1']['row1'] = 1 / 331616
>>> df
      column1
row1 0.000003

Converting the datatype on the fly:

>>> df = pd.DataFrame([[1]], columns=['column1'], index=['row1'], dtype=int)
>>> df['column1'] = df['column1'].astype(float)
>>> df['column1']['row1'] = 1 / 331616
>>> df
      column1
row1 0.000003
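
A side note on the assignment itself: both examples above assign through chained indexing (df['column1']['row1'] = ...), mirroring the original question. The pandas documentation recommends a single .loc call for assignment instead, because a chained assignment may end up modifying a temporary copy. The following is only a minimal sketch of the same fix written that way, assuming Python 3 (where / is true division) and a reasonably recent pandas. The name inferred_is_int is made up for this illustration.

```python
import pandas as pd

# With no dtype given, pandas infers an integer dtype from the value 1.
df = pd.DataFrame([[1]], columns=['column1'], index=['row1'])
inferred_is_int = pd.api.types.is_integer_dtype(df['column1'])

# Cast the column to float, then assign with a single .loc call
# (row label first, column label second).
df = df.astype({'column1': float})
df.loc['row1', 'column1'] = 1 / 331616

print(df.dtypes)
print(df)
```

Checking df.dtypes before and after the cast is a quick way to confirm which datatype pandas inferred for each column.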


Counting objects

The collections module has many wonderful tools, including the
Counter class, which greatly simplifies the counting of objects.

Let’s look at several examples:

  • counting letters in a string
  • counting items in a list
  • counting tuples within a tuple
  • special functions focused on using the results of a count

We’ll start by looking at three simple Python variables that contain elements we want to count:

>>> mystring = 'python pythonista pythonic the pythons'
>>> mylist = [11, 22, 22, 33, 33, 33, 42]
>>> mytuples = (('first', 'alpha'),
                ('first', 'alpha'), 
                ('second', 'beta'), 
                ('third', 'gamma'))

We import the Counter object from the collections module:

>>> from collections import Counter

Then we provide the item we want counted as an argument to the Counter class:

>>> mystring = 'python pythonista pythonic the pythons'
>>> chars = Counter(mystring)
>>> chars
Counter({' ': 4,              # NOTE: the space is a character
         'a': 1,              # The display order may differ on your system
         'c': 1,
         'e': 1,
         'h': 5,
         'i': 2,
         'n': 4,
         'o': 4,
         'p': 4,
         's': 2,
         't': 6,
         'y': 4})

This works just as well with our other samples:

>>> mylist = [11, 22, 22, 33, 33, 33, 42]
>>> integers = Counter(mylist)
>>> integers
Counter({11: 1, 22: 2, 33: 3, 42: 1})
>>> mytuples = (('first', 'alpha'), 
                ('first', 'alpha'), 
                ('second', 'beta'), 
                ('third', 'gamma'))
>>> tuples = Counter(mytuples)
>>> tuples
Counter({('first', 'alpha'): 2,
         ('second', 'beta'): 1, 
         ('third', 'gamma'): 1})

We can display the most common items in a Counter using the .most_common() method. By providing an argument n to the method, we can limit our results to just the n most common elements:

>>> chars.most_common(3)
[('t', 6), ('h', 5), ('p', 4)]
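
Two related behaviours are worth a quick sketch, using the mylist sample from earlier. Called with no argument, .most_common() returns every element, ordered from most to least common. A reversed slice of that list (an idiom shown in the collections documentation) gives the least common elements. The names all_counts and least_common are made up for this illustration.

```python
from collections import Counter

mylist = [11, 22, 22, 33, 33, 33, 42]
integers = Counter(mylist)

# No argument: every (element, count) pair, highest count first.
all_counts = integers.most_common()

# Reversed slice: the two least common elements.
least_common = integers.most_common()[:-3:-1]

print(all_counts)
print(least_common)
```

Elements with equal counts are tied, so it is safer not to depend on which of them comes first.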

Interestingly, the .elements() method shows the elements organized into groups, based on the actual character, number, etc. It returns an iterator that repeats each element as many times as its count, so to see the contents easily, it is common to wrap the result in another function, such as list() or sorted().

>>> list(chars.elements())
['p', 'p', 'p', 'p', 'y', 'y', 'y', 'y', 't', 't', 't', 't', 't', 't',
 'h', 'h', 'h', 'h', 'h', 'o', 'o', 'o', 'o', 'n', 'n', 'n', 'n', ' ',
 ' ', ' ', ' ', 'i', 'i', 's', 's', 'a', 'c', 'e']

>>> sorted(chars.elements())
[' ', ' ', ' ', ' ', 'a', 'c', 'e', 'h', 'h', 'h', 'h', 'h', 'i', 'i',
 'n', 'n', 'n', 'n', 'o', 'o', 'o', 'o', 'p', 'p', 'p', 'p', 's', 's',
 't', 't', 't', 't', 't', 't', 'y', 'y', 'y', 'y']
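
Because .elements() repeats each element once per count, the expanded list carries the same information as the Counter itself. A small sketch of that relationship follows; the names expanded, total and rebuilt are made up for this illustration.

```python
from collections import Counter

chars = Counter('python pythonista pythonic the pythons')

# elements() yields each element once per count, so the total number
# of yielded items equals the sum of all counts.
expanded = list(chars.elements())
total = sum(chars.values())

# Counting the expanded elements again rebuilds an equal Counter.
rebuilt = Counter(expanded)

print(len(expanded), total)
print(rebuilt == chars)
```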

Counters come with a variety of other methods, most of which mirror dictionary methods:

['clear', 'copy', 'elements', 'fromkeys', 'get', 'items', 'keys',
 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update',
 'values']
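
A few of these behave differently from their dictionary counterparts, which is easy to miss. The sketch below uses hypothetical sample data (the inventory Counter and its fruit names are invented for this illustration) to show .update(), .subtract() and the lookup of a missing key.

```python
from collections import Counter

inventory = Counter(['apple', 'apple', 'pear'])  # hypothetical sample data

# update() adds to existing counts instead of replacing them,
# unlike dict.update().
inventory.update(['pear', 'plum'])

# subtract() lowers counts; results may reach zero or go negative.
inventory.subtract(['apple', 'apple', 'apple'])

# A missing key returns 0 instead of raising KeyError,
# and looking it up does not add it to the Counter.
missing = inventory['kiwi']

print(inventory)
print(missing)
```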

Happy coding!

PyHawaii puzzle 20170102

The linked file (1138090f) is based on an Official Scrabble Word list. NOTE: while it is labeled as a .doc file, it is really a .txt file, so you don’t need any special libraries to open it/read it***.

Sort the words in the file by length (NOTE: it is important that you perform a stable sort, i.e. in considering the following words: ‘can’, ‘cart’, ‘cat’, ‘dog’, ‘door’… when sorted by length, the items with the same length should still retain their original relative order, as follows… ‘can’, ‘cat’, ‘dog’, ‘cart’, ‘door’)
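
To illustrate the stable sort described above (this is not the full puzzle solution): Python’s built-in sorted() and list.sort() are guaranteed to be stable, so sorting with key=len is enough. Here is a minimal sketch on the five sample words; the names words and by_length are made up for this illustration.

```python
words = ['can', 'cart', 'cat', 'dog', 'door']  # sample words from the text

# sorted() is stable: words of equal length keep their original order.
by_length = sorted(words, key=len)

print(by_length)
```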

Once you have sorted the list by length, find the words with the following indexes…

22615, 10582, 353, 1660, 43880
The words will reveal the secret phrase.

The backstory:

The attached file is part of the Moby Project, which collected a wide variety of word lists. The Scrabble dictionary above comes from the following compressed archive: mwords.tar.Z [4.0MB]

Notes:

*** WordPress has annoying restrictions on what can be uploaded. Apparently .txt files are verboten, but .doc files are not. = /

Want more puzzles?

See this page for links to more puzzles…

PyNow! PyHawaii’s first one-day Python Conference

PyNow Call for papers is now open:

PyNow!, the first-ever Python conference in Hawaii, is looking for speakers like you to give talks on the following topics. Sign up here!

  • Best Practices & patterns
  • Community
  • Databases
  • Data Science / Data Analysis
  • Education
  • Embedded Systems
  • Gaming
  • Python Core (language, standard library, etc)
  • Python Internals
  • Python Libraries
  • Security
  • Systems Administration
  • Testing
  • Web Frameworks

This event is being sponsored by

PyHawaii (see PyHawaii’s meetup site!)
PCATT (Pacific Center for Advanced Technology Training)
University of Hawaii Honolulu Community Colleges
Booz Allen Hamilton
The Dark Art of Coding

– Chalmer

 

Things are happening at PyHawaii: more meetups.

PyHawaii, your local Python User Group in Hawaii, is initiating two efforts:

0) a weekly series of unstructured meetups (starting on July 22nd).

1) a monthly series of larger, social structured meetups (starting on July 24th).

Visit our meetup site for details on the when and where. Can’t wait to see you there!

Interested in being even more involved with the Pacific’s Premier Python User Group?

Join our pyhawaii.slack.com community, where you can chat, share resources, ask questions and more: email us for an invite at chalmer@pyhawaii.com.

– Chalmer

 

Congrats to PyHawaii for last month’s Meetup… More successful than we dared hope…

Dark Art of Coding is a founding sponsor for PyHawaii and we want to congratulate them on their first event – a huge success!

They had 38 attendees from across the Island engaged in three primary activities:
* Padawan Track – Dark Art of Coding led the training for absolute beginners using hands-on learning opportunities.
* Puzzles – they offered up a set of easy, medium and hard puzzles to test your coding skills (designed predominantly by Dark Art with some help from our friends: special shout-out to James M!)
* Misc Fun – they also had plenty of time to hang out and enjoy each other’s company

– Chalmer

Welcome to PyHawaii’s Inaugural Meeting

Our Agenda for the Day
* Introductions
* First look at PyHawaii goals, future agendas
* Puzzles: a puzzle solving session (work together/separately to tackle a wide variety of puzzles for fame and glory)
* TechTalk: see a demo on tablib, @kennethreitz’s library for handling tabular datasets
* Padawan Track: new to programming/Python? we’ll get you started with a one-hour lesson on Python Basics for complete newbies.

Have ideas about PyHawaii or future renditions of it that you would like us to consider? Let us know! Add them to our google doc.

Want to do the puzzles? Dive in right here!

Enjoy your time here, with us. If you have any questions, ask! We are here to help.

Also! Welcome to our friends in the HIPUG Meetup group! and at HiCapacity!

Stay in touch:
Twitter: @py_hawaii
Facebook: https://www.facebook.com/pyhawaii

Have something you would like to share more privately? Reach out to the organizers at:
pyhawaii@googlegroups.com OR let us know if you want to be in the conversation via Slack!