Here are 3 ways to get the file paths that match specific patterns within a folder:
(1) Fetch all CSV files within the specified folder:
import glob

folder_path = r"C:\Users\Ron\Desktop\Test"
csv_files = glob.glob(folder_path + r"\*.csv")
print(csv_files)
Where:
- The CSV files are stored in a folder called “Test”, and the path of that folder is:
folder_path = r"C:\Users\Ron\Desktop\Test"
Don’t forget to place the letter “r” before the path. It marks the string as raw, so backslashes are not read as escape sequences (without it, the “\U” in “\Users” triggers a unicode error)
- The asterisk (“*”) is used as a wildcard to match any sequence of characters in file names within the specified “Test” folder
- The file extension .csv is used to obtain just the CSV files
The result is a list containing the full paths of 3 CSV files inside the “Test” folder:
['C:\\Users\\Ron\\Desktop\\Test\\Cars.csv', 'C:\\Users\\Ron\\Desktop\\Test\\Items.csv', 'C:\\Users\\Ron\\Desktop\\Test\\Products.csv']
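The pattern above is built by joining strings with a Windows-style backslash. As a minimal sketch of a more portable variant, the pattern can be built with os.path.join instead. The function name find_csv_files is hypothetical and used only for illustration, and the results are sorted here because glob.glob returns matches in arbitrary order:

```python
import glob
import os


def find_csv_files(folder_path):
    """Return the sorted paths of all .csv files directly inside folder_path.

    find_csv_files is a hypothetical helper name, not part of the glob module.
    """
    # os.path.join inserts the separator used by the current operating system
    pattern = os.path.join(folder_path, "*.csv")
    # glob.glob returns matches in arbitrary order, so sort for a stable result
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Same folder as in the example above; an empty list is printed if it does not exist
    print(find_csv_files(r"C:\Users\Ron\Desktop\Test"))
```

Wrapping the call in a function keeps the folder path in one place, so the same search can be reused for other folders.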
(2) Fetch all the files starting with the word “image” and ending with .png:
import glob

folder_path = r"C:\Users\Ron\Desktop\Test"
image_files = glob.glob(folder_path + r"\image*.png")
print(image_files)
Where:
- The image (.png) files are stored in a folder called “Test”, and the path of that folder is:
folder_path = r"C:\Users\Ron\Desktop\Test"
As before, don’t forget to place the letter “r” before the path
- The “image*” part is used to isolate just the file names that start with the word “image” (if, for example, there are additional image files that do not start with the word “image”, such as “landscape.png”, the paths of those images will be excluded)
- The file extension .png is used to obtain just the PNG image files
The result is a list that contains just the image files (.png) that start with the word “image”:
['C:\\Users\\Ron\\Desktop\\Test\\image_1.png', 'C:\\Users\\Ron\\Desktop\\Test\\image_2.png']
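To see which names a pattern such as “image*.png” accepts without touching the file system, the pattern can be tried against a plain list of names. The sketch below uses fnmatch.filter from the standard library, which applies the same wildcard rules that glob relies on. The file names in the list are made up for illustration, apart from those mentioned above:

```python
import fnmatch

# Illustrative file names; "image_3.jpg" is a made-up example
file_names = ["image_1.png", "image_2.png", "landscape.png", "image_3.jpg"]

# Keep only the names that start with "image" and end with ".png"
matching_names = fnmatch.filter(file_names, "image*.png")

print(matching_names)
```

Here “landscape.png” is dropped because it does not start with “image”, and “image_3.jpg” is dropped because it does not end with “.png”.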
(3) Fetch all Python files (with a “.py” file extension) in a specific directory AND subdirectories:
import glob

folder_path = r"C:\Users\Ron\Desktop\Test"
python_files = glob.glob(folder_path + r"\**\*.py", recursive=True)
print(python_files)
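Here, the “**” part of the pattern, combined with recursive=True, matches the “Test” folder itself as well as any of its sub-folders, at any depth. The result contains the paths of 3 Python files. The first one, called “my_program.py”, is located in the root “Test” folder. The two additional Python files (“hello_world.py” and “import_data.py”) are located in a sub-folder called “python_files” inside the root “Test” folder: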
['C:\\Users\\Ron\\Desktop\\Test\\my_program.py', 'C:\\Users\\Ron\\Desktop\\Test\\python_files\\hello_world.py', 'C:\\Users\\Ron\\Desktop\\Test\\python_files\\import_data.py']
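The same recursive search can also be written with pathlib, which is part of the standard library. This is a minimal sketch of an alternative rather than a replacement for the glob approach above. The function name find_python_files is hypothetical, and the paths are converted to strings and sorted so the result resembles the list shown above:

```python
from pathlib import Path


def find_python_files(folder_path):
    """Return the sorted paths of all .py files in folder_path and its sub-folders.

    find_python_files is a hypothetical helper name used for illustration.
    """
    # rglob("*.py") searches recursively, like glob("**/*.py")
    matches = Path(folder_path).rglob("*.py")
    # rglob yields Path objects; convert them to plain strings and sort them
    return sorted(str(path) for path in matches)


if __name__ == "__main__":
    # Same folder as in the example above; an empty list is printed if it does not exist
    print(find_python_files(r"C:\Users\Ron\Desktop\Test"))
```

With pathlib there is no need to build the pattern by joining strings, so the separator question does not come up.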