Python's glob module finds all pathnames matching a specified pattern according to the rules used by the Unix shell. It is part of the standard library, so there is nothing extra to install. Patterns can use the wildcards *, ? and [range], which keeps them simple. The most common use case for glob is finding files recursively.
Consider the folder tree below, which we will filter with the glob function.
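As a quick illustration of the three wildcards mentioned above, here is a minimal sketch. The file names (a1.txt, a2.txt, b1.txt, notes.md) and the helper `match` are made up for this example, and a temporary directory is used so the snippet does not depend on any existing folder.

```python
import glob
import os
import tempfile

# Build a throwaway directory with a few hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a1.txt", "a2.txt", "b1.txt", "notes.md"):
        open(os.path.join(tmp, name), "w").close()

    def match(pattern):
        # glob.escape protects any special characters in the temp path itself.
        full = os.path.join(glob.escape(tmp), pattern)
        # Sort the base names so the result does not depend on OS ordering.
        return sorted(os.path.basename(p) for p in glob.glob(full))

    star = match("*.txt")             # * matches any run of characters
    question = match("a?.txt")        # ? matches exactly one character
    char_range = match("[a-b]1.txt")  # [a-b] matches one character in the range

print(star)        # ['a1.txt', 'a2.txt', 'b1.txt']
print(question)    # ['a1.txt', 'a2.txt']
print(char_range)  # ['a1.txt', 'b1.txt']
```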
```
$ tree
.
├── a.txt
├── b.txt
├── folder1
│   ├── c.txt
│   └── d.py
└── folder2
    └── e.sh

2 directories, 5 files
```

Sample code that uses the glob module to recursively find files matching a set of patterns:
```python
import glob

base_folder = "/tmp/TEST"

# Using iglob to find all pathnames matching python or shell files
file_list = []
types = ('**/*.py', '**/*.sh')
for type_str in types:
    pattern = base_folder + "/" + type_str
    for file in glob.iglob(pattern, recursive=True):
        file_list.append(file)
print(file_list)

# Using glob to find all files
print(glob.glob('/tmp/TEST/**/*', recursive=True))

# Using glob to find all files using range
print(glob.glob('/tmp/TEST/**/[a-c]*', recursive=True))
```

Output is:
```
['/tmp/TEST/folder1/d.py', '/tmp/TEST/folder2/e.sh']
['/tmp/TEST/folder2', '/tmp/TEST/b.txt', '/tmp/TEST/a.txt', '/tmp/TEST/folder1', '/tmp/TEST/folder2/e.sh', '/tmp/TEST/folder1/d.py', '/tmp/TEST/folder1/c.txt']
['/tmp/TEST/b.txt', '/tmp/TEST/a.txt', '/tmp/TEST/folder1/c.txt']
```

Additional notes on glob:
- iglob returns an iterator without actually storing all the results simultaneously
- Using the '**' pattern in a large directory tree may consume an inordinate amount of time
A few common patterns:
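The difference between iglob and glob noted above can be sketched roughly as follows. The directory layout (three sub folders with four .log files each) is invented for the example. Because iglob yields paths one at a time, `itertools.islice` can stop after the first few matches without collecting the rest.

```python
import glob
import itertools
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: 3 sub folders with 4 .log files each.
    for d in range(3):
        sub = os.path.join(tmp, f"dir{d}")
        os.makedirs(sub)
        for f in range(4):
            open(os.path.join(sub, f"file{f}.log"), "w").close()

    pattern = os.path.join(glob.escape(tmp), "**", "*.log")

    # iglob yields paths lazily, so we can stop after the first 5 matches.
    matches = glob.iglob(pattern, recursive=True)
    first_five = list(itertools.islice(matches, 5))

    # glob builds the complete list of matches before returning.
    everything = glob.glob(pattern, recursive=True)

print(len(first_five), len(everything))  # 5 12
```

On a tree this small the saving is negligible; the lazy form is only likely to matter when the match count is large or when you need just the first few hits.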
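- '**/*' - All files and folders, including those in sub folders (requires recursive=True)
- '*' - All files and folders at the top level
- '*/*' - All files and folders one level down, that is, directly inside the top-level folders
- '**/*.py' - All Python files in all folders (requires recursive=True)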
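To see what each of these patterns picks up, the sketch below recreates the sample tree from earlier in this post inside a temporary directory and runs all four patterns against it. The helper `relative_matches` is a made-up name for this example; it returns paths relative to the base folder and sorted, so the output is easier to compare.

```python
import glob
import os
import tempfile

def relative_matches(base, pattern):
    """Return sorted matches for pattern under base, relative to base."""
    full = os.path.join(glob.escape(base), pattern)
    return sorted(
        os.path.relpath(p, base).replace(os.sep, "/")
        for p in glob.glob(full, recursive=True)
    )

with tempfile.TemporaryDirectory() as base:
    # Recreate the sample tree: two top-level files and two folders.
    for rel in ("a.txt", "b.txt", "folder1/c.txt", "folder1/d.py", "folder2/e.sh"):
        path = os.path.join(base, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    results = {p: relative_matches(base, p) for p in ("**/*", "*", "*/*", "**/*.py")}

for pattern, found in results.items():
    print(f"{pattern!r}: {found}")
# '**/*': ['a.txt', 'b.txt', 'folder1', 'folder1/c.txt', 'folder1/d.py', 'folder2', 'folder2/e.sh']
# '*': ['a.txt', 'b.txt', 'folder1', 'folder2']
# '*/*': ['folder1/c.txt', 'folder1/d.py', 'folder2/e.sh']
# '**/*.py': ['folder1/d.py']
```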