Posts

Showing posts from April, 2024

PySpark code-1: How to delete all the CSV files from the Databricks file system

All the files to be deleted under a particular path start with "part" and end with ".csv".

Code

# Define the directory path
directory_path = "/FileStore/tables/Sample DataSource/"

# List the files directly under this directory
# (dbutils.fs.ls does not descend into subdirectories)
csv_files = dbutils.fs.ls(directory_path)

# Delete each entry whose name starts with "part" and ends with ".csv"
for file in csv_files:
    if file.name.startswith("part") and file.name.endswith(".csv"):
        dbutils.fs.rm(file.path, recurse=True)

In Databricks, dbutils.fs.ls() returns a list of FileInfo objects. Each FileInfo object represents a file or directory in the specified location, and its name attribute holds the name of that file or directory.

Here's an explanation of file.name:

file: This is a variable representing a single i...
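Since dbutils.fs.ls() lists only one directory level, a recursive variant may be useful when the part files sit in subfolders. The following is a minimal sketch, not the post's own method: the helper names find_part_csv_files and delete_part_csv_files, the dry_run flag, and the FakeFileInfo/FakeFS stand-ins are all hypothetical and exist only so the logic can be run outside Databricks. It assumes that directory entries returned by dbutils.fs.ls() have names ending in "/".

```python
from collections import namedtuple

# Hypothetical stand-in for the FileInfo objects returned by dbutils.fs.ls()
FakeFileInfo = namedtuple("FakeFileInfo", ["path", "name", "size"])


class FakeFS:
    """Hypothetical in-memory substitute for dbutils.fs (ls and rm only)."""

    def __init__(self, tree):
        self.tree = tree      # maps a directory path to its list of entries
        self.removed = []     # records the paths passed to rm()

    def ls(self, path):
        return self.tree.get(path, [])

    def rm(self, path, recurse=False):
        self.removed.append(path)


def find_part_csv_files(fs, directory_path):
    """Recursively collect paths of files named part*.csv under directory_path."""
    matches = []
    for entry in fs.ls(directory_path):
        if entry.name.endswith("/"):
            # Assumption: directory names end with "/", so descend into them
            matches.extend(find_part_csv_files(fs, entry.path))
        elif entry.name.startswith("part") and entry.name.endswith(".csv"):
            matches.append(entry.path)
    return matches


def delete_part_csv_files(fs, directory_path, dry_run=True):
    """Return the matching paths; remove them only when dry_run is False."""
    paths = find_part_csv_files(fs, directory_path)
    if not dry_run:
        for path in paths:
            fs.rm(path)
    return paths


# Demonstration against the fake file system
fake_fs = FakeFS({
    "/data/": [
        FakeFileInfo("/data/part-0000.csv", "part-0000.csv", 10),
        FakeFileInfo("/data/_SUCCESS", "_SUCCESS", 0),
        FakeFileInfo("/data/sub/", "sub/", 0),
    ],
    "/data/sub/": [
        FakeFileInfo("/data/sub/part-0001.csv", "part-0001.csv", 5),
        FakeFileInfo("/data/sub/notes.txt", "notes.txt", 3),
    ],
})

found = find_part_csv_files(fake_fs, "/data/")
deleted = delete_part_csv_files(fake_fs, "/data/", dry_run=False)
print(found)
```

In a Databricks notebook, the same helper could be called as delete_part_csv_files(dbutils.fs, directory_path, dry_run=False). Running it first with dry_run=True gives a chance to review the list of paths before anything is removed.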