Search ZIP File using Java - NIO ZPFS Example

In this tutorial, we will search a ZIP archive for a specific file using Java, with an example program. Instead of the traditional java.util.zip ZipEntry approach, we will use NIO and ZPFS (Zip File System) to mount the ZIP file as a file system. We will then use the walkFileTree method available in NIO to scan the ZIP file tree and locate the file we want. Sounds interesting? NIO lets you probe a ZIP archive with the same file-walking API you would use on an ordinary directory. The steps involved are explained in the sections that follow:

Searching using FileVisitor Interface:

In order to search any directory or file system recursively, the Java class we are writing should implement the methods provided by the FileVisitor interface. If you recall, we did the same in our recursive directory listing using NIO example.

To query a ZIP archive, we implement the visitFile method and examine each incoming entry from the ZIP file to see if it matches the file we are searching for. If there is no match, we continue. On a match, we print where the file is located in the ZIP archive and terminate the search.


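The remaining FileVisitor methods (preVisitDirectory, postVisitDirectory and visitFileFailed) need no special logic for this example. They are implemented only because the interface requires them, and each one simply returns CONTINUE.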
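If you would rather write only visitFile, the JDK class SimpleFileVisitor already supplies default implementations of the other callbacks. The sketch below is one possible way to do that; the class name NameMatchVisitor and its getMatch method are made up for this illustration and are not part of the program in this tutorial.

```java
import java.nio.file.FileVisitResult;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/* Hypothetical helper: stops at the first file whose name equals wantedName */
class NameMatchVisitor extends SimpleFileVisitor<Path> {
    private final String wantedName;
    private Path match; // stays null until a match is found

    NameMatchVisitor(String wantedName) {
        this.wantedName = wantedName;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        Path name = file.getFileName(); // null only for a root path
        if (name != null && name.toString().equals(wantedName)) {
            match = file;
            return FileVisitResult.TERMINATE; // stop at the first match
        }
        return FileVisitResult.CONTINUE;
    }

    /* Returns the first matching path, or null if nothing matched */
    Path getMatch() {
        return match;
    }
}
```

Pass an instance to Files.walkFileTree and read getMatch() afterwards. Because the visitor only depends on Path, it works on a regular directory as well as on a root of a ZIP file system.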

Read File Name to be Searched:

We create a Path object in this step to hold the name of the file that needs to be located in the ZIP file. In our case, the file we want to locate is dest.sql. It is present in two places in our sample ZIP file. Refer to the screenshot below:
Search ZIP File using ZPFS / NIO / Java - Example
One file is located in the root folder and the other is in the “sp” folder. We want the program to spot both locations and print them in the output. The full code is provided at the end of this tutorial.


Create ZIP File System:

We have to create a ZIP File System in order to probe the archive. We have discussed the steps to create a ZIP File System many times on this blog. A code snippet to create ZPFS in Java is shown below:

        Map<String, String> zip_properties = new HashMap<>();
        zip_properties.put("create", "false");
        URI zip_disk = URI.create("jar:file:/sp.zip");
        FileSystem zipfs = FileSystems.newFileSystem(zip_disk, zip_properties);
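The snippet above hard-codes the archive location and never closes the file system. A minimal sketch of a more portable variant follows, assuming you have the archive location as a Path. The class name ZipMount and the method open are hypothetical names chosen for this illustration. The "jar:" URI is built from Path.toUri(), and the demo uses try-with-resources so the ZIP File System is closed when we are done.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/* Hypothetical helper for mounting a ZIP archive as a file system */
class ZipMount {
    /* Opens zipPath as a ZIP file system; create=true makes a new archive if none exists */
    static FileSystem open(Path zipPath, boolean create) throws IOException {
        Map<String, String> zip_properties = new HashMap<>();
        zip_properties.put("create", String.valueOf(create));
        // Prefix the file: URI of the archive with the jar: scheme
        URI zip_disk = URI.create("jar:" + zipPath.toUri());
        return FileSystems.newFileSystem(zip_disk, zip_properties);
    }

    public static void main(String[] args) throws IOException {
        // Demo only: a throwaway archive in a temporary directory
        Path zip = Files.createTempDirectory("zipmount").resolve("demo.zip");
        try (FileSystem zipfs = open(zip, true)) { // closed automatically
            for (Path root : zipfs.getRootDirectories()) {
                System.out.println("root: " + root);
            }
        }
    }
}
```

Closing the file system matters because only one ZIP File System can be open per archive URI at a time, and any pending changes are written out on close.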


Get root directories to Search:

Once the file system is created, we use the getRootDirectories method to get a list of all root paths inside the ZIP file. We walk each root directory and, recursively, the folders and subfolders under it, looking for occurrences of our input file. Note that we match files only by name in this tutorial.

The search is done using the walkFileTree method available in the java.nio.file.Files class. We use a public boolean field as a flag that is set when a match is found. If no match is found at the end of the search, we output a message to the user that the searched file is not available in the ZIP archive.
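As a rough illustration of the root loop combined with walkFileTree, the sketch below collects every match in a list instead of setting a flag. The class name ZipNameSearch and the method findAll are hypothetical, and the method assumes it is handed an already opened ZIP File System.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

/* Hypothetical helper: name-only search over all roots of a file system */
class ZipNameSearch {
    /* Returns the path of every file named wantedName, in the order visited */
    static List<String> findAll(FileSystem fs, String wantedName) throws IOException {
        List<String> matches = new ArrayList<>();
        for (Path root : fs.getRootDirectories()) {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    Path name = file.getFileName(); // null only for a root path
                    if (name != null && name.toString().equals(wantedName)) {
                        matches.add(file.toString());
                    }
                    return FileVisitResult.CONTINUE; // keep going to collect every match
                }
            });
        }
        return matches;
    }
}
```

An empty list plays the role of the "not found" flag here, so the caller can print the not-available message when findAll returns nothing.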


Search ZIP Archive for File – Complete Java NIO ZPFS Program

The complete Java program that recursively searches the entire ZIP archive for a file is shown below:

import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.*;
import java.io.IOException;
import java.util.*;
import java.net.URI;

class Search implements FileVisitor<Path> {
    /* Name of the file we are searching for */
    private final Path searchedFile;
    /* This flag is set to true if the file is found */
    public boolean file_found_flag;

    public Search(Path searchedFile) {
        this.searchedFile = searchedFile;
        this.file_found_flag = false;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
    throws IOException {
        /* getFileName() returns null for a root path, so check it before use */
        Path name = file.getFileName();
        if (name != null && name.toString().equals(searchedFile.toString())) {
            System.out.println("ZIP File Contains " + searchedFile +
            " at " + file.toRealPath().toString());
            file_found_flag = true;
            // Terminates search on first match. Set this to CONTINUE to find all matches
            return FileVisitResult.TERMINATE;
        }
        return FileVisitResult.CONTINUE;
    }

    /* We don't use these, so they just continue the walk */
    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc)
    throws IOException {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
    throws IOException {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc)
    throws IOException {
        return FileVisitResult.CONTINUE;
    }

    public static void main(String[] args) throws IOException {

        /* The file that needs to be searched inside the ZIP File */
        Path searchFile = Paths.get("dest.sql");
        Search walk = new Search(searchFile);

        /* Define ZIP File System Properties in HashMap */
        Map<String, String> zip_properties = new HashMap<>();
        zip_properties.put("create", "false");
        URI zip_disk = URI.create("jar:file:/sp.zip");

        /* try-with-resources closes the ZIP File System when the search is done */
        try (FileSystem zipfs = FileSystems.newFileSystem(zip_disk, zip_properties)) {
            Iterable<Path> dirs = zipfs.getRootDirectories();

            for (Path root : dirs) {
                if (!walk.file_found_flag) {
                    Files.walkFileTree(root, walk);
                }
            }
        }
        if (!walk.file_found_flag) {
            System.out.println("ZIP File does not contain " + searchFile);
        }
    }

}

You need to change FileVisitResult.TERMINATE in the visitFile method to CONTINUE if you want to list all the matches. With that change, the output of this program is shown below:

ZIP File Contains dest.sql at /dest.sql
ZIP File Contains dest.sql at /sp/dest.sql

Voila! You created a program to search a ZIP file in Java. You can also enhance this program to search by regular expressions or wildcards, all inside a ZIP file. This NIO + ZPFS combination opens up a range of search capabilities for your ZIP file with very little code. We will discuss more examples of searching ZIP files in upcoming posts. If you have any questions on this post in the meantime, you can give us a shout through a comment.
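As a starting point for the wildcard idea, here is a minimal sketch that swaps the exact name comparison for a glob pattern, using the getPathMatcher method that every NIO FileSystem provides. The class name ZipGlobSearch and the method findByGlob are hypothetical, and the pattern is matched against the file name only, not the full path inside the archive.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

/* Hypothetical helper: wildcard search over an opened ZIP file system */
class ZipGlobSearch {
    /* Returns every file whose name matches the glob, for example "*.sql" */
    static List<String> findByGlob(FileSystem fs, String glob) throws IOException {
        // Ask the ZIP file system itself for the matcher so it fits its paths
        PathMatcher matcher = fs.getPathMatcher("glob:" + glob);
        List<String> matches = new ArrayList<>();
        for (Path root : fs.getRootDirectories()) {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    Path name = file.getFileName(); // null only for a root path
                    if (name != null && matcher.matches(name)) {
                        matches.add(file.toString());
                    }
                    return FileVisitResult.CONTINUE;
                }
            });
        }
        return matches;
    }
}
```

getPathMatcher also accepts a "regex:" prefix, so the same structure should carry over to regular expression searches.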
