How to find the size of a directory using Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. In this post, we answer the question: how do you find the size of a directory using Hadoop?

Let’s look at how to list the contents of a directory in Hadoop and how to find its size.

List the Files

hadoop fs -ls /path/to/directory

This command lists the files and subdirectories under /path/to/directory, along with details such as permissions, owner, group, size, and modification time.
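
As a hedged illustration of working with that listing, the sketch below keeps only the subdirectories. It assumes the usual listing layout, in which the permission string comes first (starting with d for directories) and the path comes last. The name list_subdirs is made up for this example, and paths that contain spaces would need more careful parsing.

```shell
# list_subdirs: hypothetical helper, not part of Hadoop.
# Reads `hadoop fs -ls` output on stdin and prints the path of each directory.
# Assumes column 1 is the permission string and the last column is the path.
list_subdirs() {
    awk '$1 ~ /^d/ { print $NF }'
}

# Usage (needs a configured Hadoop client):
#   hadoop fs -ls /path/to/directory | list_subdirs
```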

Size of the Directory

hadoop fs -du -s -h /path/to/directory

This command displays the total size of the given directory: -s summarizes the contents into a single total, and -h prints the size in a human-readable unit. To see a breakdown instead, use the following command.

hadoop fs -du -s -h /path/to/directory/*

This command shows the breakdown: the size of each file and subdirectory directly under the given directory, which together make up its total size.
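
To rank that breakdown, one option is to drop -h so the sizes stay in raw bytes and sort on the first column. The sketch below assumes the first column of the du output is the size in bytes. The names sort_by_size and sum_sizes are illustrative, not Hadoop commands.

```shell
# Hypothetical helpers; both read `hadoop fs -du` output (without -h) on stdin.
# Assumption: the first column of each line is the size in bytes.

# Print the entries largest first.
sort_by_size() {
    sort -k1,1nr
}

# Add up the first column to get a combined size in bytes.
sum_sizes() {
    awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Usage (needs a configured Hadoop client):
#   hadoop fs -du /path/to/directory | sort_by_size | head -n 5
#   hadoop fs -du /path/to/directory | sum_sizes
```

The -h flag is left out on purpose, since human-readable values such as 1.2 G do not sort or add up numerically.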
