Python Programming

Lecture 4 Strings, Dictionaries

4.1 Strings

  • A string is a sequence of characters. The empty string is '' (two quotes with nothing between them), not ' ' (which contains one space). You can access the characters one at a time with the bracket operator; indices start at 0:

  • 
    >>> fruit = 'banana'
    >>> letter = fruit[1]
    >>> print(letter)
    a
    
    >>> len(fruit)
    6
    
  • Iteration and conditional execution

  • 
    fruit = 'banana'
    for char in fruit:
        print(char)
    
    
    word = 'banana'
    count = 0
    for letter in word:
        if letter == 'a':
            count = count + 1
    print(count)
    
  • String slices


>>> fruit = 'banana'
>>> fruit[1:3]
'an'
>>> fruit[3:]
'ana'
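  • A minimal sketch of the remaining slice forms, assuming the same fruit string: omitting the first index starts at the beginning, omitting both indices copies the whole string, and equal indices give the empty string. The variable names are only illustrative.

```python
fruit = 'banana'

first_three = fruit[:3]   # from the start up to, but not including, index 3
whole = fruit[:]          # both indices omitted: the whole string
empty = fruit[3:3]        # start equals stop: the empty string

print(first_three)  # ban
print(whole)        # banana
print(repr(empty))  # ''
```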
  • Strings are immutable, lists are mutable


>>> greeting = 'Hello, world!'
>>> greeting[0] = 'J'
TypeError: 'str' object does not support item assignment

>>> greeting = 'Hello, world!'
>>> new_greeting = 'J' + greeting[1:]
>>> print(new_greeting)
Jello, world!
  • The in operator


>>> print('a' in 'banana')
True
>>> print('seed' in 'banana')
False
>>> print('x' in ['x','y','z'])
True
  • String comparison

    Comparison operations are useful for putting words in alphabetical order.


>>> print('apple'>'banana')
False
>>> print('ba' > 'banana')
False
>>> print([0, 1, 2] < [0, 3, 4])
True
>>> a_list = ["orange", "apple", "banana"]
>>> a_list.sort()
>>> print(a_list)
['apple', 'banana', 'orange']
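  • One caveat worth a quick sketch: all uppercase letters compare as smaller than all lowercase letters, so mixed-case words may not sort the way a dictionary would list them. A possible workaround, shown here with made-up words, is to sort on a lowercase version of each word.

```python
words = ['orange', 'Pineapple', 'apple', 'banana']

print('Pineapple' < 'apple')  # True: 'P' comes before 'a' in character order

plain = sorted(words)                         # uppercase words sort first
ignoring_case = sorted(words, key=str.lower)  # compare lowercase versions

print(plain)          # ['Pineapple', 'apple', 'banana', 'orange']
print(ignoring_case)  # ['apple', 'banana', 'orange', 'Pineapple']
```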
  • Some useful string methods


>>> stuff = 'Hello world'
>>> dir(stuff)

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__',
 '__eq__', '__format__','__ge__', '__getattribute__', '__getitem__', 
 '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', 
 '__iter__', '__le__','__len__', '__lt__', '__mod__', '__mul__', '__ne__', 
 '__new__', '__reduce__','__reduce_ex__', '__repr__', '__rmod__', '__rmul__', 
 '__setattr__', '__sizeof__','__str__', '__subclasshook__', 'capitalize', 
 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 
 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 
 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 
 'isupper', 'join', 'ljust','lower', 'lstrip', 'maketrans', 'partition', 
 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 
 'split', 'splitlines', 'startswith','strip', 'swapcase', 'title', 'translate', 
 'upper', 'zfill']

>>> help(str.title)
Help on method_descriptor:
title(...)
    S.title() -> str
    Return a titlecased version of S, i.e. words start with title case
    characters, all remaining cased characters have lower case.
  • .title(), .lower(), .upper()


>>> name = "ada lovelace"
>>> print(name.title())
Ada Lovelace

>>> print(name)
ada lovelace

>>> name = "Ada Lovelace"
>>> print(name.upper())
ADA LOVELACE

>>> print(name.lower())
ada lovelace
  • .find() searches for the position of a string in another string


>>> word = 'banana'
>>> index = word.find('a')
>>> print(index)
1
>>> word.find('na')
2
>>> word.find('na', 3)
4
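  • A short sketch of the unsuccessful case: when the substring is not found, .find() returns -1 instead of raising an error, so it is worth checking the result before using it as an index. The variable names are only illustrative.

```python
word = 'banana'

missing = word.find('z')   # 'z' does not occur anywhere in the word
late = word.find('b', 1)   # there is no 'b' at or after index 1

print(missing)  # -1
print(late)     # -1

if missing == -1:
    print('not found')
```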
  • .startswith() returns a Boolean value (the comparison is case-sensitive)


>>> line = 'Have a nice day'
>>> line.startswith('h') 
False

>>> line.lower()
'have a nice day'

>>> line.lower().startswith('h')
True
  • Parsing strings


>>> data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
>>> atpos = data.find('@')
>>> print(atpos)
21
>>> sppos = data.find(' ',atpos)
>>> print(sppos)
31
>>> host = data[atpos+1 : sppos]
>>> print(host)
uct.ac.za
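  • The same steps can be wrapped in a function. This is only a sketch: the name extract_host is hypothetical, and the handling of lines without '@' or without a following space is an assumption about what a caller would want.

```python
def extract_host(line):
    """Return the text between '@' and the next space, or None if there is no '@'."""
    atpos = line.find('@')
    if atpos == -1:            # find() returns -1 when nothing is found
        return None
    sppos = line.find(' ', atpos)
    if sppos == -1:            # the address is the last thing on the line
        sppos = len(line)
    return line[atpos + 1:sppos]

data = 'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008'
print(extract_host(data))          # uct.ac.za
print(extract_host('no address'))  # None
```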
  • .split() breaks a sentence into words and returns them as a list


>>> s = 'pining for the fjords'
>>> t = s.split()
>>> print(t)
['pining', 'for', 'the', 'fjords']
>>> print(t[2])
the
  • One can break a word into letters.


>>> s = 'spam'
>>> t = list(s)
>>> print(t)
['s', 'p', 'a', 'm']
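  • A brief sketch of two related calls: .split() also accepts a delimiter, and .join() (listed by dir() above) goes the other way, gluing a list of strings back together. The sample strings are made up for illustration.

```python
s = 'spam-spam-spam'
parts = s.split('-')      # split on the hyphen instead of on whitespace
print(parts)              # ['spam', 'spam', 'spam']

t = ['pining', 'for', 'the', 'fjords']
sentence = ' '.join(t)    # the string before .join is placed between the words
print(sentence)           # pining for the fjords
```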
  • Format operator %

  • It allows us to construct strings, replacing parts of the strings with the data stored in variables.

  • 
    >>> number = 42
    >>> print('I have spotted %d camels.' % number)
    I have spotted 42 camels.
    
  • What if we do not use the format operator?


>>> number = 42
>>> print('I have spotted number camels.') # no error, but prints the word "number" instead of 42
>>> print('I have spotted '+str(number)+' camels.') # works, but is less simple
  • If there is more than one format sequence in the string, the values must be given as a tuple, written with ().

  • 
    >>> print('In %d years I have spotted %g %s.' % (3, 0.1, 'camels'))
    In 3 years I have spotted 0.1 camels.
    
  • The number of elements in the tuple must match the number of format sequences in the string. The types of the elements also must match the format sequences.
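  • A small sketch of what happens when either rule is broken: in both cases Python raises a TypeError. The try/except blocks are only there so that the example runs to the end; the exact wording of the messages may differ between Python versions.

```python
try:
    text = '%d %d %d' % (1, 2)     # three format sequences, but only two values
except TypeError as err:
    count_error = str(err)

try:
    text = '%d' % 'dollars'        # %d needs a number, not a string
except TypeError as err:
    type_error = str(err)

print(count_error)
print(type_error)
```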

  • .format()
  • 
    >>> print('In {} years I have spotted {} {}.'.format(3, 0.1, 'camels'))
    In 3 years I have spotted 0.1 camels.
    
    >>> print('In {0} years I have spotted {1} {2}.'.format(3, 0.1, 'camels'))
    In 3 years I have spotted 0.1 camels.
    
    >>> print('In {1} years I have spotted {0} {2}.'.format(3, 0.1, 'camels'))
    In 0.1 years I have spotted 3 camels.
    

String: Summary

  • The elements of a string are characters. Empty string ''

  • Features: ordered, immutable, and the same character may appear more than once

  • Indexing and slicing work the same way as for lists.

  • The in operator returns a Boolean value telling whether one string contains another. Strings can be compared in alphabetical order. (Lists support similar operations.)

  • .upper(), .lower(), .title(), .find(), .startswith(), .split() (list('string') breaks a string into characters; list() alone gives an empty list)

  • Format operator: %d integer, %g float, %s string (use a tuple for more than one format sequence), .format()

4.2 Dictionaries

  • A dictionary is like a list, but more general. In a list, the index positions have to be integers; in a dictionary, the indices can be (almost) any type.

  • 
    >>> x = {} # an empty dictionary; it counts as False in a condition
    >>> print(x)
    {}
    
  • The empty dictionary can be written as dict(). Similarly, an empty list can be list().

  • You can think of a dictionary as a mapping between a set of indices (which are called keys) and a set of values.

  • 
    >>> x['one'] = 'uno'
    >>> print(x)
    {'one': 'uno'}
    
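  • Since Python 3.7, a dictionary keeps its items in the order in which they were inserted; in older versions the order was unpredictable. Either way, items are looked up by key, not by position.

  • 
    >>> x = {'one': 'apple', 'two': 'banana', 'three': 'orange'}
    >>> print(x)
    {'one': 'apple', 'two': 'banana', 'three': 'orange'}
    
    >>> print(x['two'])
    banana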
    
  • Adding New Key-Value Pairs

  • 
    >>> alien = {}
    >>> alien['color'] = 'green'
    >>> alien['points'] = 5
    >>> print(alien)
    {'color': 'green', 'points': 5}
    
  • To create a dictionary, you can start with an empty one and add key-value pairs. Keys must be immutable (more precisely, hashable) and unique: assigning to an existing key replaces its value. Lists cannot be keys.
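  • A minimal sketch of the rules for keys, using a made-up dictionary called grid: a tuple works as a key, a list does not, and assigning to an existing key overwrites the old value rather than creating a second entry.

```python
grid = {}
grid[(0, 1)] = 'treasure'    # a tuple is immutable, so it can be a key
grid['name'] = 'map'         # strings and numbers work too

try:
    grid[[0, 1]] = 'trap'    # a list is mutable and cannot be a key
except TypeError as err:
    message = str(err)
print(message)

grid['name'] = 'island'      # same key again: the value is replaced
print(len(grid))             # 2
print(grid['name'])          # island
```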

  • Modifying Values in a Dictionary

  • 
    >>> alien['color'] = 'yellow'
    >>> print(alien)
    {'color': 'yellow', 'points': 5}
    
  • Removing Key-Value Pairs

  • 
    >>> del alien['points']
    >>> print(alien)
    {'color': 'yellow'}
    
  • Merge two dictionaries


>>> dict1 = { "name":"owen", "age": 18 }
>>> dict2 = { "birthday": "1999-11-22"}
>>> x = dict( dict1, **dict2 )
>>> print(x)
{'name': 'owen', 'age': 18, 'birthday': '1999-11-22'}

>>> a = { 'x' : 1 , 'y' : 2 }
>>> b = { 'y' : 3 , 'z' : 4 }
>>> c = dict(a,**b)
>>> print(c)
{'x': 1, 'y': 3, 'z': 4}
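  • Two other ways to write the same merge, sketched with the dictionaries a and b from above. One reason to know them: dict(a, **b) only accepts b when all of its keys are strings, whereas .update() and the {**a, **b} form have no such restriction.

```python
a = {'x': 1, 'y': 2}
b = {'y': 3, 'z': 4}

c = a.copy()      # work on a copy so that a itself is unchanged
c.update(b)       # for shared keys, the value from b wins

d = {**a, **b}    # the same merge written in one expression

print(c)  # {'x': 1, 'y': 3, 'z': 4}
print(d)  # {'x': 1, 'y': 3, 'z': 4}
print(a)  # {'x': 1, 'y': 2}

# Non-string keys are fine with update():
e = {1: 'one'}
e.update({2: 'two'})
print(e)  # {1: 'one', 2: 'two'}
```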

in operator

  • The in operator works on dictionaries; it tests whether something appears as a key.

  • 
    >>> x = {'one': 'apple', 'two': 'banana', 'three': 'orange'}
    >>> 'one' in x
    True
    >>> 'uno' in x
    False
    
  • To see whether something appears as a value in a dictionary, you can use the method .values().

  • 
    >>> x = {'one': 'apple', 'two': 'banana', 'three': 'orange'}
    >>> print('orange' in x.values())
    True
    
  • The len function works on dictionaries; it returns the number of key-value pairs

  • 
    >>> x = {'one': 'apple', 'two': 'banana', 'three': 'orange'}
    >>> len(x)
    3
    

Methods of Dictionary

  • Suppose you are given a string and you want to count how many times each letter appears.
  • 
    word = 'banana'
    d = dict()
    for c in word:
        if c not in d:
            d[c] = 1
        else:
            d[c] = d[c] + 1
    print(d)
    
    
    {'b': 1, 'a': 3, 'n': 2}
    
  • Dictionaries have a method called .get() that takes a key and a default value. If the key is in the dictionary, .get() returns its value; otherwise it returns the default.
  • 
    >>> counts = { 'chuck' : 1 , 'annie' : 42, 'jan': 100}
    >>> print(counts.get('jan', 0)) #0 is the default value
    100
    >>> print(counts.get('anni', 1550))
    1550
    
    
    word = 'banana'
    d = dict()
    for c in word:
        d[c] = d.get(c,0) + 1
    print(d)
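  • The same pattern counts words instead of letters once .split() is added. A sketch with a made-up sentence:

```python
line = 'the quick brown fox jumps over the lazy dog the end'

counts = {}
for word in line.split():
    # First occurrence: get() returns 0, so the count becomes 1.
    counts[word] = counts.get(word, 0) + 1

print(counts['the'])         # 3
print(counts.get('cat', 0))  # 0
```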
    
  • .items(), .keys(), .values()
  • Looping Through All Key-Value Pairs


favorite = {
    'jen': 'python',
    'sarah': 'c',
    'edward': 'ruby',
    }
for x, y in favorite.items():
    print(x, y)

#for x, y in list(favorite.items()):

jen python
sarah c
edward ruby
  • Looping Through All the Keys in a Dictionary


favorite = {
    'jen': 'python',
    'sarah': 'c',
    'edward': 'ruby',
    }
for x in favorite.keys():
    print(x.title())

#for x in list(favorite.keys()):

Jen
Sarah
Edward

friends = ['sarah', 'jen']
for x in favorite.keys():
    if x in friends:
        print(x.title()+" "+favorite[x])


Jen python
Sarah c
  • Looping Through All Values in a Dictionary


favorite = {
    'jen': 'python',
    'sarah': 'c',
    'edward': 'ruby',
    }
for y in favorite.values():
    print(y.title())

#for language in list(favorite.values()):

Python
C
Ruby
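  • If the output should be in alphabetical order rather than insertion order, one option is to loop over sorted keys. A sketch with the same favorite dictionary; the name ordered is only illustrative.

```python
favorite = {
    'jen': 'python',
    'sarah': 'c',
    'edward': 'ruby',
    }

ordered = sorted(favorite.keys())   # a new list; the dictionary is unchanged
for x in ordered:
    print(x.title(), favorite[x])

# Edward ruby
# Jen python
# Sarah c
```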

Nesting

  • You can nest a set of dictionaries inside a list, a list of items inside a dictionary, or even a dictionary inside another dictionary. A list or a dictionary cannot be a key, because keys must be immutable.

  • Dictionaries in a list

  • 
    alien_0 = {'color': 'green', 'points': 5}
    alien_1 = {'color': 'yellow', 'points': 10}
    alien_2 = {'color': 'red', 'points': 15}
    
    aliens = [alien_0, alien_1, alien_2]
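  • Once the dictionaries are in a list, a for loop can visit each one in turn. A sketch that prints every alien and adds up the points; the variable total is only illustrative.

```python
alien_0 = {'color': 'green', 'points': 5}
alien_1 = {'color': 'yellow', 'points': 10}
alien_2 = {'color': 'red', 'points': 15}

aliens = [alien_0, alien_1, alien_2]

total = 0
for alien in aliens:                 # each element is a dictionary
    print(alien['color'], alien['points'])
    total = total + alien['points']

print(total)            # 30
print(aliens[1]['color'])  # yellow
```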
    
  • A List in a Dictionary

  • 
    pizza = {
        'crust': 'thick',
        'toppings': ['mushrooms', 'extra cheese'],
        }
    print(pizza['crust'], pizza['toppings'][1])
    
    
    thick extra cheese
    
  • You can nest a list inside a dictionary any time you want more than one value to be associated with a single key in a dictionary


favorite_languages = {
    'jen': ['python', 'ruby'],
    'sarah': ['c'],
    'edward': ['ruby', 'go'],
    'phil': ['python', 'haskell'],
    }
print(favorite_languages['jen'][0].title())

Python
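  • To go through every name and every language, one loop can be nested inside another. A sketch using the same dictionary; the list python_fans is only illustrative.

```python
favorite_languages = {
    'jen': ['python', 'ruby'],
    'sarah': ['c'],
    'edward': ['ruby', 'go'],
    'phil': ['python', 'haskell'],
    }

python_fans = []
for name, languages in favorite_languages.items():
    print(name.title() + ':')
    for language in languages:       # languages is a list
        print('    ' + language.title())
    if 'python' in languages:        # the in operator on a list
        python_fans.append(name)

print(python_fans)  # ['jen', 'phil']
```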
  • A Dictionary in a Dictionary


users = {
    'aeinstein': {
        'first': 'albert',
        'last': 'einstein',
        'location': 'princeton',
        },

    'mcurie': {
        'first': 'marie',
        'last': 'curie',
        'location': 'paris',
        },
    }

full_name = users['aeinstein']['first']+" "+users['aeinstein']['last']
location = users['aeinstein']['location']

print("Full name: " + full_name.title())
print("Location: " + location.title())


Full name: Albert Einstein
Location: Princeton
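  • The same lookup can be repeated for every user with .items(). A sketch, assuming every inner dictionary has the keys 'first', 'last' and 'location'; the list summary is only illustrative.

```python
users = {
    'aeinstein': {
        'first': 'albert',
        'last': 'einstein',
        'location': 'princeton',
        },
    'mcurie': {
        'first': 'marie',
        'last': 'curie',
        'location': 'paris',
        },
    }

summary = []
for username, info in users.items():     # info is the inner dictionary
    full_name = info['first'] + ' ' + info['last']
    summary.append(username + ': ' + full_name.title() + ', ' + info['location'].title())

for line in summary:
    print(line)

# aeinstein: Albert Einstein, Princeton
# mcurie: Marie Curie, Paris
```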

Dictionaries: Summary

  • Elements are key-value pairs. Empty dictionary {}

  • Features: items keep their insertion order (Python 3.7+; unpredictable in older versions); dictionaries are mutable (keys must be immutable, but you can add, modify, and remove values). Keys are unique.

  • .get() (value or default value)

    Looping: .items(), .keys(), .values()

  • in operator works on keys (but you can use .values())

  • Nesting: A list of dictionaries, a list in a dictionary, a dictionary in a dictionary

4.3 Lottery (Super Lotto Analysis)

Targets

  • 1. We have a list of lottery results from 2020. Extract the five-number group from each result.
  • 2. Find the frequency of each number in the 5-number groups.
  • 3. Sort those numbers by their frequencies
  • The format of the data:
  • 
    [[2020001, "17 25 26 32 34-04 07"], 
    [2020002, "03 07 18 25 30-02 07"], 
    ... ,
    [2020092, "08 11 17 31 35-07 11"]]
    
    
    import json
    
    filename = 'lottery.json' 
    with open(filename) as f:
        data = json.load(f)
    
  • "Cooking" step by step
  • Step 1: collect the 5 numbers

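five_number = []
for x in data:
    position = x[1].find('-')     # the hyphen separates the two groups
    string_5 = x[1][:position]    # e.g. "17 25 26 32 34"
    list_5 = string_5.split()     # e.g. ['17', '25', '26', '32', '34']
    five_number.append(list_5)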
  • Step 2: frequencies

d = {}
for x in five_number:
    for y in x:
        d[y] = d.get(y, 0) + 1    # y is a string such as '17'
  • Step 3: sort and display

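fre = []
for x, y in d.items():
    fre.append([y, x])    # count first, so that sort() orders by count
fre.sort()                # ascending: the least frequent numbers come first
for x in fre:
    print(x)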
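  • The three steps can be tried without the file. This is only a sketch: data holds just the three sample rows shown above (not the real contents of lottery.json), split('-') is used in place of find(), and reverse=True is an assumed preference for showing the most frequent numbers first.

```python
# Tiny stand-in for the real data: only the three rows shown in the notes.
data = [
    [2020001, "17 25 26 32 34-04 07"],
    [2020002, "03 07 18 25 30-02 07"],
    [2020092, "08 11 17 31 35-07 11"],
]

# Steps 1 and 2: take the text before the hyphen and count each number.
d = {}
for x in data:
    front = x[1].split('-')[0]        # e.g. "17 25 26 32 34"
    for y in front.split():
        d[y] = d.get(y, 0) + 1

# Step 3: sort by count, largest first (ties are ordered by the number string).
fre = []
for number, count in d.items():
    fre.append([count, number])
fre.sort(reverse=True)

for count, number in fre:
    print(number, count)
```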
  • Practice: do the same thing for the 2-number groups.

Summary

  • Strings, Dictionaries
  • Reading: Python for Everybody
    • Chapter 6, 9