There is one built-in command, dbFileList , that creates a detailed list of the folders/files under a given user-defined starting folder, and places this list in a database table. I find it a powerful capability when doing mass actions on the folder/file system.
The structure of the created table is:
- path – path name of the current folder/file, without the name of the current folder/file, and terminated by a “\” character
- name – name of the current folder/file
- ext – for files, this is the file extension in lowercase!
- attribute – a decimal integer representation of the file’s attributes
- type – a string, either “folder” or “file”, derived from the attribute column!
- size – size of the file in bytes
- creation – creation date in the format “yyyy-mm-dd hh:mm”
- access – last access date in the format “yyyy-mm-dd hh:mm”
- write – last write date in the format “yyyy-mm-dd hh:mm”
- level – integer showing the level of folder/subfolder nesting
- folderkey – an integer which has a different value for each folder/subfolder
- hash – the result of the hash digest/signature operation
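To make the table layout concrete, here is a hedged Python/SQLite sketch: it creates a table with the same columns and two made-up rows (the values are purely illustrative – the real command populates the table itself), and shows how the type column relates to the attribute column via the standard Windows FILE_ATTRIBUTE_DIRECTORY bit (16):

```python
import sqlite3

FILE_ATTRIBUTE_DIRECTORY = 0x10  # Windows attribute bit that marks a folder

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE myTable (
        path TEXT, name TEXT, ext TEXT, attribute INTEGER, type TEXT,
        size INTEGER, creation TEXT, access TEXT, write TEXT,
        level INTEGER, folderkey INTEGER, hash TEXT
    )
""")

# Two illustrative rows (all values invented for the example)
rows = [
    ("c:\\pics\\", "holiday", "", 16, "folder", 0,
     "2023-01-05 10:00", "2023-01-05 10:00", "2023-01-05 10:00", 1, 2, ""),
    ("c:\\pics\\holiday\\", "beach.jpg", "jpg", 32, "file", 204800,
     "2023-01-05 10:01", "2023-01-06 09:30", "2023-01-05 10:01", 2, 3, ""),
]
conn.executemany("INSERT INTO myTable VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", rows)

# 'type' is derived from 'attribute': the directory bit decides folder vs file
for attribute, typ in conn.execute("SELECT attribute, type FROM myTable"):
    derived = "folder" if attribute & FILE_ATTRIBUTE_DIRECTORY else "file"
    print(typ, derived)  # the two values agree for every row
```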
dbFileList
A built-in command that creates a detailed list of the folders/files under a given starting folder and writes it to a database table.
Parameter 1: OPTIONAL <string expression defining a valid SQL schema>
Specifies the database schema in which the resulting table is to be created. Default = “main”.
Parameter 2: OPTIONAL <string expression specifying the name of the table to be created>
Specifies the database table into which the resulting folder/file information is to be placed. Default = “myTable”.
If the specified table already exists it will be overwritten!
Parameter 3: <string expression: base/starting folder for the action>
There are 3 cases to be considered for this parameter:
- If it specifies a FILE then the information written to the table will only be for that specific file. i.e. the created table will have only 1 row!
- If it specifies a FOLDER, but is not terminated by the \ character, then the information written to the table will only be for that specific folder. i.e. the created table will have only 1 row!
- If it specifies a FOLDER, and is terminated by the \ character, then the information written will be based on the folders/files under the specified FOLDER. i.e. the created table might contain many rows.
Parameter 4: OPTIONAL <integer expression specifying the number of folder levels to be included in this action> : default = -1 (= all)
This parameter is only relevant if parameter 3 specifies a FOLDER that is terminated by the \ character.
This parameter controls how many levels of nested folders are to be included in this action.
Parameter 5: OPTIONAL <string expression specifying a filter to be applied> : default = “*”
Technically this command uses the Microsoft Windows FindFirstFile (et al.) API, which supports a wildcard filtering capability, using * for an unknown character string, and ? for an unknown single character.
So, for example, to limit the action to only files with the extension jpg you could set the filter to be “*.jpg”.
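FindFirstFile-style wildcards behave much like shell globbing (Windows matching is case-insensitive, and there are some quirks around short 8.3 names), so Python’s fnmatch gives a reasonable approximation of how a filter selects names. This is only an illustration of the wildcard semantics, not the command’s actual matcher:

```python
from fnmatch import fnmatch

names = ["beach.jpg", "notes.txt", "img01.JPG", "a.jpeg"]

# '*' matches any run of characters, '?' matches exactly one character.
# Lowercasing both sides approximates Windows' case-insensitive matching.
jpg_only = [n for n in names if fnmatch(n.lower(), "*.jpg")]
print(jpg_only)       # → ['beach.jpg', 'img01.JPG']  ('a.jpeg' does NOT match)

one_digit = [n for n in names if fnmatch(n.lower(), "img0?.jpg")]
print(one_digit)      # → ['img01.JPG']
```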
Parameter 6: OPTIONAL <string expression specifying a hash algorithm> : default = “”
I think that this is a slightly unusual capability, but I have used it a number of times.
While the dbFileList command is running it can compute a hash signature/digest of each file being examined.
The supported hash algorithms are:
- SHA512
- SHA384
- SHA256
- SHA1
- MD5
- MD4
- MD2
Be aware that generating a hash signature/digest is a relatively slow process – maybe less than 10 files per second, depending on file size and the hash algorithm selected.
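The digests in the hash column are presumably the standard hex digests these algorithms produce; as a point of reference, Python’s hashlib computes the same ones, and reading the file in chunks keeps memory usage flat however large the file is:

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hex digest of a file, read in 1 MiB chunks so large files stay cheap."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Quick sanity check on a known value rather than a real file:
print(hashlib.md5(b"hello").hexdigest())  # 5d41402abc4b2a76b9719d911017c592
```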
So a mini-script to find the duplicate files under a given folder could be:
if ( ! dbOpen ) then
dbOpen ( "memory")
End IF
dim myPath = "c:\users\user\pictures\\"
dbFileList ( "temp", "files" , myPath , -1 , , "MD5")
drop table if exists temp.dupHash
create table temp.dupHash as select hash from temp.files where type ="file" group by hash having count (*) > 1
drop table if exists temp.duplicates
create table temp.duplicates as select * from temp.files where hash in ( select hash from temp.dupHash) order by hash
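The two SQL statements doing the real work above can be exercised outside the tool as well. This Python/SQLite sketch fakes a small files table (rows invented for the example) and runs the same GROUP BY / HAVING and IN (…) queries to pull out the duplicate rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT, name TEXT, type TEXT, hash TEXT)")
conn.executemany("INSERT INTO files VALUES (?,?,?,?)", [
    ("c:\\pics\\",    "a.jpg", "file",   "aaa"),
    ("c:\\pics\\x\\", "b.jpg", "file",   "aaa"),  # same hash as a.jpg -> duplicate
    ("c:\\pics\\",    "c.jpg", "file",   "bbb"),  # unique hash
    ("c:\\pics\\",    "x",     "folder", ""),     # folders excluded by the type filter
])

# Step 1: hashes that occur more than once among files
conn.execute("""CREATE TABLE dupHash AS
    SELECT hash FROM files WHERE type = 'file'
    GROUP BY hash HAVING count(*) > 1""")

# Step 2: the full rows carrying those hashes, grouped together by hash
dups = conn.execute("""SELECT path, name FROM files
    WHERE hash IN (SELECT hash FROM dupHash) ORDER BY hash""").fetchall()
print(dups)  # only a.jpg and b.jpg are reported
```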
You will find a mini-script to delete all empty folders under a given starting point, ns_delete_empty_folders, here.