glob: Finding files in a directory

{:.no_toc}

* TOC {:toc}

Goal

We want to work with many files in a directory. What is an easy way to get the filenames in a directory?

Questions to David Rotermund

Creating test files

from pathlib import Path

Path("Testfile_1.mat").touch()
Path("Testfile_2.mat").touch()
Path("Testfile_10.mat").touch()
Path("Testfile_3.mat").touch()

Using glob in a for-loop

import glob

for filename in glob.glob("*.mat"):
    print(filename)

Testfile_1.mat
Testfile_2.mat
Testfile_10.mat
Testfile_3.mat
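
The loop above searches the current working directory, and glob does not guarantee any particular order for its results. As a minimal sketch of searching a different directory, the pattern can be prefixed with the directory path. The temporary directory and the extra file notes.txt are assumptions made only so that the example is self-contained:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    # Create the test files inside the temporary directory
    for number in (1, 2, 10, 3):
        open(os.path.join(directory, f"Testfile_{number}.mat"), "w").close()
    # One extra file that the pattern should not match
    open(os.path.join(directory, "notes.txt"), "w").close()

    # glob.escape protects special characters in the directory name itself
    pattern = os.path.join(glob.escape(directory), "*.mat")

    # glob returns the paths including the directory part; basename strips it
    found = [os.path.basename(path) for path in glob.glob(pattern)]

# Sorted only to make the printed result reproducible
print(sorted(found))
# ['Testfile_1.mat', 'Testfile_10.mat', 'Testfile_2.mat', 'Testfile_3.mat']
```

Without os.path.basename, every entry would still contain the directory part of the path.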

Using glob to create a list

import glob

# Use a name other than "list" so the built-in list() stays available
filenames = glob.glob("*.mat")
print(filenames)

['Testfile_1.mat', 'Testfile_2.mat', 'Testfile_10.mat', 'Testfile_3.mat']

Sorting the filenames

import glob

filenames = sorted(glob.glob("*.mat"))
print(filenames)

['Testfile_1.mat', 'Testfile_10.mat', 'Testfile_2.mat', 'Testfile_3.mat']

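Hmmm... This result is not helpful. sorted compares the names character by character, so Testfile_10.mat ends up before Testfile_2.mat.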
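
One way to get the numeric order with the standard library alone is a custom sort key that splits each name into text and number chunks. This is only a minimal sketch, and natural_key is a hypothetical helper name introduced for this example:

```python
import re


def natural_key(name):
    """Build a sort key in which digit runs are compared as integers."""
    # With a capturing group, re.split keeps the digit runs in the result.
    # Text chunks sit at even positions, digit runs at odd positions.
    parts = re.split(r"(\d+)", name)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]


filenames = ["Testfile_1.mat", "Testfile_10.mat", "Testfile_2.mat", "Testfile_3.mat"]

print(sorted(filenames, key=natural_key))
# ['Testfile_1.mat', 'Testfile_2.mat', 'Testfile_3.mat', 'Testfile_10.mat']
```

The key for Testfile_10.mat is ['testfile_', 10, '.mat'], so the integer 10 is compared with the integer 2 instead of the character "1" with the character "2".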

Sorting the filenames with natsort

pip install natsort

import glob
from natsort import natsorted

filenames = natsorted(glob.glob("*.mat"))
print(filenames)

['Testfile_1.mat', 'Testfile_2.mat', 'Testfile_3.mat', 'Testfile_10.mat']

rsplit

Maybe you don't want the file extension. Then we can use rsplit on the string to split it once at the last dot and keep the part before it.

import glob
from natsort import natsorted

for filename in natsorted(glob.glob("*.mat")):
    print(filename.rsplit(".", 1)[0])

Testfile_1
Testfile_2
Testfile_3
Testfile_10

Alternatively, without a for-loop, using map, list and a lambda function:

import glob
from natsort import natsorted

filenames = natsorted(glob.glob("*.mat"))
filenames = list(map(lambda s: s.rsplit(".", 1)[0], filenames))
print(filenames)

['Testfile_1', 'Testfile_2', 'Testfile_3', 'Testfile_10']
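
Since pathlib is already used for creating the test files, the extension can also be dropped with the stem attribute of a Path object instead of rsplit. The following is a minimal sketch. The temporary directory is an assumption made only to keep the example self-contained, and plain sorted is used so that it runs without natsort:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    for number in (1, 2, 10, 3):
        (directory / f"Testfile_{number}.mat").touch()

    # Path.glob yields Path objects; .stem is the filename without its last suffix
    stems = sorted(path.stem for path in directory.glob("*.mat"))

print(stems)
# ['Testfile_1', 'Testfile_10', 'Testfile_2', 'Testfile_3']
```

Replacing sorted with natsorted should give the same natural order as in the examples above.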