Opening files:

  • open(path, mode, encoding=...) returns a file object
  • Always use a with block — it closes the file automatically, even on error
  • Modes: r read (default), w write (truncates), a append, x create (fails if the file exists); add b for binary (e.g. rb) or + for read and write (e.g. r+)
with open("data.txt", "r", encoding="utf-8") as f:
    content = f.read()
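
A minimal sketch of how the x, a, and w modes differ. The file name demo.txt and the helper demo_modes are made up for illustration, and a temporary directory is used so nothing real is touched:

```python
import os
import tempfile


def demo_modes(directory):
    """Exercise x, a, and w on a throwaway file (hypothetical helper)."""
    path = os.path.join(directory, "demo.txt")

    # "x" creates the file and refuses to overwrite an existing one.
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")
    try:
        open(path, "x", encoding="utf-8")
        created_twice = True
    except FileExistsError:
        created_twice = False

    # "a" keeps the existing content and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")
    with open(path, "r", encoding="utf-8") as f:
        after_append = f.read()

    # "w" truncates the file before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.write("replaced\n")
    with open(path, "r", encoding="utf-8") as f:
        after_write = f.read()

    return created_twice, after_append, after_write


with tempfile.TemporaryDirectory() as tmp:
    result = demo_modes(tmp)
print(result)
```

The second open with "x" raises FileExistsError, which makes x a reasonable choice when overwriting an existing file would be a bug.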

Reading text:

  • read() loads the whole file as one string; readlines() gives a list of lines (keeping \n)
  • Prefer iterating the file object for large files — it streams line by line instead of loading everything
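with open("data.txt", encoding="utf-8") as f:
    whole = f.read()            # entire file as one string

# or, in a separate open (after read() there is nothing left for readlines()):
with open("data.txt", encoding="utf-8") as f:
    lines = f.readlines()       # list of lines, each keeping its "\n"

with open("data.txt", encoding="utf-8") as f:   # memory-friendly: stream line by line
    for line in f:
        process(line.rstrip("\n"))   # process() stands in for your own handler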

Writing text:

with open("out.txt", "w", encoding="utf-8") as f:
    f.write("one line\n")
    f.writelines(["a\n", "b\n"])
    print("via print", file=f)

Parsing common formats:

  • CSV → stdlib csv module (handles quoting / embedded commas correctly)
import csv
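with open("data.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f):       # row is a list of strings
        ...
# with a header row (re-open: the first reader has already consumed the file):
with open("data.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):   # row is a dict keyed by column name
        ...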
  • JSON → json module
import json
with open("data.json", encoding="utf-8") as f:
    data = json.load(f)         # file -> Python object
with open("out.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

text = json.dumps(data)         # object -> string
data = json.loads(text)         # string -> object
  • Whitespace / columns → str.split() and str.strip()
with open("nums.txt", encoding="utf-8") as f:
    rows = [list(map(int, line.split())) for line in f]
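
To illustrate why the csv module is preferred over splitting on commas, here is a small sketch. The sample text is invented for the demo, and io.StringIO stands in for a real file opened with newline="":

```python
import csv
import io

# Hypothetical sample: the second record has a comma and doubled quotes
# inside quoted fields.
text = 'name,comment\n"Smith, J.","said ""hi"""\n'

# Naive approach: splitting on "," breaks the quoted field apart.
naive = text.splitlines()[1].split(",")

# csv.reader understands the quoting rules and returns two clean fields.
parsed = list(csv.reader(io.StringIO(text)))

print(len(naive))   # 3 pieces, with stray quote characters left in
print(parsed[1])    # ['Smith, J.', 'said "hi"']
```

The naive split yields three fragments for a two-column record, while csv.reader returns the two fields with the quoting removed.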

Paths (pathlib):

from pathlib import Path
p = Path("data") / "file.txt"
text = p.read_text(encoding="utf-8")   # one-shot read
p.write_text("hello", encoding="utf-8")  # one-shot write (replaces existing content)
p.exists(); p.suffix; p.stem           # exists() -> bool; suffix is ".txt"; stem is "file"
list(p.parent.glob("*.csv"))           # find files by pattern

Gotchas:

  • Always pass encoding="utf-8" for text — the default is platform-dependent
  • Use newline="" when opening files for the csv module (avoids blank rows on Windows when writing, and keeps newlines inside quoted fields intact when reading)
  • read() / readlines() load the whole file into memory — iterate for large files
  • Binary mode (rb) returns bytes, not str
  • A file object is exhausted after one full read — re-open() or f.seek(0) to read again
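
A short sketch of the last two gotchas, using a throwaway file in a temporary directory. The file name sample.txt and the helper demo_gotchas are assumptions made for the demo:

```python
import os
import tempfile


def demo_gotchas(path):
    """Show file exhaustion and bytes vs. str (hypothetical helper)."""
    # newline="" writes "\n" as-is, so the byte count is the same on every OS.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write("h\u00e9llo\n")

    with open(path, encoding="utf-8") as f:
        first = f.read()     # the whole content
        second = f.read()    # empty string: the position is already at the end
        f.seek(0)            # rewind to the start
        third = f.read()     # the whole content again

    with open(path, "rb") as f:
        raw = f.read()       # bytes, not str

    return first, second, third, raw


with tempfile.TemporaryDirectory() as tmp:
    first, second, third, raw = demo_gotchas(os.path.join(tmp, "sample.txt"))

print(repr(second))             # ''
print(type(raw).__name__)       # bytes
print(len(first), len(raw))     # 6 characters, 7 bytes
```

The text has 6 characters but 7 bytes because the accented letter takes two bytes in UTF-8. Decoding with raw.decode("utf-8") gives back the same string as the text-mode read.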