Python Programming

Lecture 7 Variable Scope, File

7.1 Variable Scope

Variable Scope

In Python, a module, class, def, or lambda introduces a new variable scope. Blocks such as if/elif/else, try/except, and for/while do not introduce a new scope.


if True:
    msg = 'I am from Shanghai'
print(msg) # We can use msg.

def test():
    msg = 'I am from Shanghai'
print(msg) # NameError when this snippet runs on its own: msg exists only inside test()
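A small sketch contrasting the two cases above: names bound inside for and try blocks remain visible afterwards, while a name bound inside a function does not. The names make_local, hidden, and leaked are made up for this illustration.

```python
# Blocks such as for and try do not create a new scope.
for i in range(3):
    last_square = i * i       # bound inside the loop body

try:
    ratio = 10 / 4            # bound inside the try block
except ZeroDivisionError:
    ratio = None

print(i, last_square, ratio)  # all three names are still usable here

# A def does create a new scope (make_local is a hypothetical name).
def make_local():
    hidden = 'only visible inside make_local'
    return hidden

make_local()
try:
    hidden                    # looking the name up outside the function
    leaked = True
except NameError:
    leaked = False

print('hidden leaked out of the function:', leaked)
```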

Global and Local


msg_loc = "Shanghai" # Global
def test():
    msg = 'I am from Shanghai' # Local
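One way to see the difference is to ask Python which names each namespace holds. This is only an illustrative sketch using the built-in locals() and globals() functions; the variable local_names is an assumption added for the example.

```python
msg_loc = "Shanghai"  # global: defined at module level

def test():
    msg = 'I am from Shanghai'  # local: exists only while test() runs
    # locals() returns the names bound in the current function call
    return sorted(locals())

local_names = test()
print(local_names)             # the names local to test()
print('msg_loc' in globals())  # the global name lives in the module namespace
print('msg' in globals())      # the local name never reaches the module namespace
```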

New Assignment


msg = 'I am from Shanghai'
def test():
    msg = 'I am from Beijing'
    print(msg) 
test()
print(msg)

I am from Beijing
I am from Shanghai



Reference


msg = 'I am from Shanghai'
def test():
    print(msg) 
test()

I am from Shanghai


Referenced before assignment


msg = 'I am from Shanghai'
def test():
    print(msg) 
    msg = 'I am from Beijing' 
test() 

UnboundLocalError: local variable 'msg' referenced before assignment


  • How do we modify a variable defined outside the function? Use the global keyword.


num = 1
def fun():
    global num
    num = 123
    print(num)
fun()
print(num)

123
123


 


num = 1
def fun():
    print(num)
    global num  
    num = 123
    print(num)
fun()

SyntaxError: name 'num' is used prior to global declaration





a = 10
def test():
    a = a + 1
    print(a)
test()

UnboundLocalError: local variable 'a' referenced before assignment



a = 10
def test():
    global a 
    a = a + 1
    print(a)
test()

11





a = 10
def test():
    a = 10
    a = a + 1
    print(a)
test() 
print(a) 

11
10
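The global statement is not the only way to update a module-level value. A common alternative, sketched here with the hypothetical function name increment, is to pass the value in and return the result, so the rebinding happens outside the function.

```python
a = 10

def increment(value):
    # value is a local name; the caller decides what to do with the result
    return value + 1

a = increment(a)  # rebinding happens at module level, so no global is needed
print(a)
```

This keeps the function independent of any particular global name, which usually makes it easier to reuse.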




Mutable vs. Immutable


a = [1,2,3]
def test():
    print(a)
    a = [1,2,3,4]
test() 

UnboundLocalError: local variable 'a' referenced before assignment



a = [1,2,3]
def test():
    print(a)
    a.append(4)
test() 
print(a)

[1, 2, 3]
[1, 2, 3, 4]




a = {"color": "green"}
def test():
    print(a)
    a["color"] = "red"
    a["position"] = "left"
test() 
print(a)

{'color': 'green'}
{'color': 'red', 'position': 'left'}




Parameter and Argument

  • For mutable objects (like lists and dictionaries), assigning a new value to a parameter inside a function only rebinds the local name; it does not change the original variable outside the function. (The same is true for immutable objects.)


def changeme(mylist):
    mylist = [1, 2]            # rebinds the local name only
    print("inside:", mylist)

x = [10, 20]
changeme(x)
print("outside:", x)

inside: [1, 2]
outside: [10, 20]




  • However, modifying the contents of the mutable object (e.g., adding elements to a list or updating a dictionary) can change the value of the variable outside the function.


def changeme(mylist):
    mylist.extend([1, 2])      # modifies the list object in place
    print("inside:", mylist)

x = [10, 20]
changeme(x)
print("outside:", x)

inside: [10, 20, 1, 2]
outside: [10, 20, 1, 2]
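The two cases can be checked directly with the is operator, which tests whether two names refer to the same object. The function names rebind and mutate are made up for this sketch, and passing a copy is shown as one possible way to protect the caller's list.

```python
def rebind(mylist):
    mylist = [1, 2]        # rebinds the local name only
    return mylist

def mutate(mylist):
    mylist.extend([1, 2])  # changes the object the caller also sees
    return mylist

x = [10, 20]
print(rebind(x) is x)      # a different list object comes back
print(mutate(x) is x)      # the very same list object comes back
print(x)

y = [10, 20]
mutate(y.copy())           # pass a copy to protect the original
print(y)
```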




7.2 File

Reading from a File

  • pi_digits.txt


3.1415926535
8979323846
2643383279

from pathlib import Path

path = Path('pi_digits.txt')
contents = path.read_text()
contents = contents.rstrip()
print(contents)

3.1415926535
8979323846
2643383279

File Path

  • relative path


path = Path('text_files/filename.txt')

  • absolute path


path = Path('/home/eric/data_files/text_files/filename.txt')  # Linux/macOS

# Windows: forward slashes, a raw string, or doubled backslashes all work
path = Path('c:/text_files/filename.txt')
path = Path(r'c:\text_files\filename.txt')
path = Path('c:\\text_files\\filename.txt')

path = Path('c:\text_files\filename.txt')  # wrong path: \t and \f are read as escape sequences (tab, form feed)
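To see why the last spelling goes wrong, it helps to look at the string itself. The sketch below compares the plain and raw string literals and uses pathlib.PureWindowsPath, which parses Windows-style paths on any operating system; the variable names are assumptions for the example.

```python
from pathlib import PureWindowsPath

bad = 'c:\text_files\filename.txt'    # \t becomes a tab, \f a form feed
good = r'c:\text_files\filename.txt'  # raw string keeps the backslashes

print(repr(bad))                      # shows the escape characters
print('\t' in bad, '\f' in bad)
print(len(bad), len(good))            # bad is shorter: two escapes collapsed

# PureWindowsPath handles Windows paths without touching the file system.
forward = PureWindowsPath('c:/text_files/filename.txt')
raw = PureWindowsPath(good)
print(forward == raw)                 # both spellings name the same path
print(raw.parts)
```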
  • Making a List of Lines from a File


from pathlib import Path

path = Path('pi_digits.txt')
contents = path.read_text()

lines = contents.splitlines()
print(lines)

  • Working with a File's Contents


pi_string = ''
for line in lines:
    pi_string = pi_string + line
print(pi_string)
print(len(pi_string))

['3.1415926535', '8979323846', '2643383279']
3.141592653589793238462643383279 # string
32
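The loop above can also be written with str.join, which builds the string in one step. This is an equivalent alternative shown for comparison, with the list of lines written out by hand so the snippet runs without the file.

```python
lines = ['3.1415926535', '8979323846', '2643383279']

# str.join concatenates every item, using '' as the separator
pi_string = ''.join(lines)
print(pi_string)
print(len(pi_string))
```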

Writing to a File


from pathlib import Path

path = Path('programming.txt')

path.write_text("I love programming.")

  • Python can only write strings to a text file. If you want to store numerical data in a text file, you'll have to convert the data to string format first using the str() function.


from pathlib import Path


contents = "I love programming.\n"
contents += "I love creating new games.\n"
contents += "I also love working with data.\n"

path = Path('programming.txt')
path.write_text(contents)

I love programming.
I love creating new games.
I also love working with data. 
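A minimal sketch of the str() conversion mentioned above. The scores list and the file name scores.txt are hypothetical, and a temporary folder is used so the example leaves no files behind.

```python
from pathlib import Path
import tempfile

scores = [90, 85.5, 77]  # hypothetical numeric data

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'scores.txt'

    # write_text() accepts only str, so convert each number first
    contents = '\n'.join(str(score) for score in scores)
    path.write_text(contents)

    # Reading gives strings back; convert them to numbers again
    restored = [float(line) for line in path.read_text().splitlines()]

print(contents)
print(restored)
```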

Encoding


from pathlib import Path

path = Path('alice.txt')
contents = path.read_text(encoding='utf-8') 

Character Encoding: ASCII, Unicode, UTF-8, GBK

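  • Unicode unifies the characters of all languages into a single character set. The standard keeps evolving; a common representation (UTF-16) uses two bytes for most characters and four bytes for rare ones. Modern operating systems and most programming languages support Unicode directly.
  • UTF-8 encodes each Unicode character as 1 to 4 bytes, depending on its code point: common English letters take 1 byte, Chinese characters usually take 3 bytes, and only rare characters take 4 bytes.
  • In memory, text is handled as Unicode; when it is saved to disk or transmitted, it is usually converted to UTF-8.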

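chinese = '你好'.encode('utf-8')
print(chinese)       # Output: b'\xe4\xbd\xa0\xe5\xa5\xbd'
# This is a bytes object, not a str. The bytes are shown in hexadecimal for readability; \x is the hex escape prefix.
print(len(chinese))  # Output: 6 (each Chinese character takes 3 bytes)

print(b'\xe4\xbd\xa0\xe5\xa5\xbd'.decode('utf-8'))  # Output: 你好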
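The byte counts described above can be checked directly. This sketch compares three sample characters (chosen for illustration) in UTF-8, shows that GBK uses a different byte layout, and shows that decoding with the wrong codec does not return the original text.

```python
# A Latin letter, a Chinese character, and an emoji (written as an escape)
samples = ['A', '你', '\U0001F600']

for ch in samples:
    utf8 = ch.encode('utf-8')
    print('code point', hex(ord(ch)), 'takes', len(utf8), 'UTF-8 byte(s)')

# GBK stores this Chinese character in 2 bytes instead of 3.
print(len('你'.encode('gbk')))

# Decoding with the wrong codec does not give the original text back.
data = '你好'.encode('utf-8')
wrong = data.decode('gbk', errors='replace')
right = data.decode('utf-8')
print(wrong == '你好', right == '你好')
```

This is why read_text(encoding='utf-8') names the encoding explicitly: the bytes on disk only make sense together with the codec that produced them.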
  • QR codes (two-dimensional barcodes)

import qrcode  # third-party package: pip install qrcode

img = qrcode.make("Hello!")
img.save("x.png")

import qrcode

img = qrcode.make("https://wangwanglulu.com/")
img.save("wl.png")

JSON

  • JSON (JavaScript Object Notation, pronounced /ˈdʒeɪsən/; also /ˈdʒeɪˌsɒn/) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.
  • JSON is a language-independent data format. It was derived from JavaScript, but many modern programming languages include code to generate and parse JSON-format data. JSON filenames use the extension .json.
  • json.dumps() converts a Python object into a JSON string; json.loads() parses a JSON string back into a Python object.

from pathlib import Path
import json


numbers = [2, 3, 5, 7, 11, 13]

path = Path('numbers.json')
contents = json.dumps(numbers)
path.write_text(contents)

from pathlib import Path
import json

path = Path('numbers.json')
contents = path.read_text()
numbers = json.loads(contents)

print(numbers)

[2, 3, 5, 7, 11, 13]
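The same round trip works for dictionaries, with a few conversions worth knowing. The record below is made-up sample data; ensure_ascii is a real json.dumps parameter that controls whether non-ASCII characters are escaped.

```python
import json

# Hypothetical sample record
record = {"city": "上海", "sizes": (1, 2), "open": True, "owner": None}

text = json.dumps(record)                          # non-ASCII is escaped by default
readable = json.dumps(record, ensure_ascii=False)  # keep the characters as they are
print(text)
print(readable)

restored = json.loads(text)
print(restored)
print(type(restored["sizes"]))  # the tuple comes back as a list
```

JSON has no tuple type, so tuples are written as arrays and read back as lists; True and None become true and null in the JSON text.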

Reading a JSON File with One Object per Line


from pathlib import Path
import json

path = Path('yelp_sample.json')
contents = path.read_text()          # read the whole file as one string
lines = contents.splitlines()        # split by line; each line is a separate JSON object string
yelp = []
for line in lines:
    yelp.append(json.loads(line))    # parse each line's JSON string into a dictionary
    
print(len(yelp))
print(yelp[100])
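A self-contained sketch of the same line-by-line technique, for trying it out without the data file. The sample records and the file name reviews_sample.json are made up, and the fields are assumptions rather than the real Yelp schema. Calling json.loads() on the whole file would fail here, because the file holds several JSON objects rather than one.

```python
from pathlib import Path
import json
import tempfile

# Hypothetical sample: one JSON object per line, plus one blank line
sample = (
    '{"stars": 5, "text": "Great food"}\n'
    '{"stars": 2, "text": "Slow service"}\n'
    '\n'
    '{"stars": 4, "text": "Nice staff"}\n'
)

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'reviews_sample.json'
    path.write_text(sample, encoding='utf-8')
    contents = path.read_text(encoding='utf-8')

reviews = []
for line in contents.splitlines():
    if line.strip():                      # skip blank lines
        reviews.append(json.loads(line))  # each line becomes one dict

print(len(reviews))
print(reviews[1]["text"])
```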

Reading Multiple Text Files


import os
from pathlib import Path
from textblob import TextBlob  # third-party package: pip install textblob

folder_path = "marvel/"
summary = []

for filename in os.listdir(folder_path):  # list every entry in the folder
    # build the relative path
    path = Path(os.path.join(folder_path, filename))
    contents = path.read_text(errors="ignore")  # skip undecodable bytes when the file encoding is uncertain
    lines = contents.splitlines()
    scores = []

    for line in lines:
        line = line.strip()
        if line:
            # sentiment polarity of this line: from -1 (negative) to +1 (positive), 0 is neutral
            polarity = TextBlob(line).sentiment.polarity
            scores.append(polarity) 
    if scores:
        average = sum(scores) / len(scores)
        summary.append((average, filename))

summary.sort()
for score, movie in summary:
    print(movie, score)
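The folder loop can also be written with pathlib alone. The sketch below is a variation, not the lecture's method: it uses Path.glob() to pick only .txt files, builds its own hypothetical sample files in a temporary folder, and replaces the TextBlob polarity with a stand-in score (average line length) so it runs without third-party packages.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as folder:
    folder_path = Path(folder)
    # Hypothetical sample files standing in for the marvel/ folder
    (folder_path / 'a.txt').write_text('short\n\nmuch longer line\n', encoding='utf-8')
    (folder_path / 'b.txt').write_text('tiny\n', encoding='utf-8')
    (folder_path / 'empty.txt').write_text('', encoding='utf-8')

    summary = []
    for path in sorted(folder_path.glob('*.txt')):  # only .txt files
        contents = path.read_text(encoding='utf-8', errors='ignore')
        # Stand-in score: length of each non-blank line
        scores = [len(line.strip()) for line in contents.splitlines() if line.strip()]
        if scores:                                  # skip files with no usable lines
            summary.append((sum(scores) / len(scores), path.name))

summary.sort()
for score, name in summary:
    print(name, score)
```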

Exercise

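  1. Read all the dialogue from Game of Thrones, Season 1, and put the text of every line into a list. Download: Game of Thrones Season 1 dialogue JSON file
  2. Analyze whether the sentiment of the Season 1 dialogue is positive, negative, or neutral (polarity from -1 to +1).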

Summary

  • Functions
    • Reading: Python for Everybody, Chapter 4
    • Reading: Python Crash Course, Chapters 8 and 10