Recently, I started learning Hadoop and practised MapReduce jobs on CDH (Cloudera Distribution Including Apache Hadoop), which is available as a ready-made virtual machine. It is an excellent VM loaded with Hadoop ecosystem applications, and it has CentOS installed as its operating system.
In this tutorial, I will walk through how to SSH into the VM and run commands from your local terminal rather than performing every task inside the VM. I adopted this approach for practicing Hadoop because my VM used to freeze at times while I was working in Eclipse inside it (due to the limited RAM on my machine).
To connect to the VM over SSH, we need three things:
i) OpenSSH installed in the VM
ii) Firewall disabled in the VM
iii) Network Configuration of VirtualBox/VMware
Step I : Installing OpenSSH
Open a terminal in the VM. Log in as root and install OpenSSH using the following command:
yum -y install openssh-server
Once OpenSSH has been installed, we can add the service to startup and start it by running the following commands:
chkconfig sshd on
service sshd start
Finally, we can check whether OpenSSH is listening on port 22 with the following command:
netstat -tulpn | grep :22
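If you want this check to be scriptable, here is a minimal sketch that reads the netstat output on standard input and succeeds only when a TCP socket is actually listening on the given port. The function name listening_on_port is my own and not part of any tool. Unlike a plain grep for :22, it will not match port 2222 or a :22 that appears in the foreign address column.

```shell
# Hypothetical helper: succeeds if the `netstat -tulpn` output on stdin
# contains a TCP socket in LISTEN state whose local address ends in :<port>.
listening_on_port() {
    port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ {
            # Column 4 is the local address, e.g. 0.0.0.0:22 or :::22;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port && $6 == "LISTEN") found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage inside the VM (as root):
#   netstat -tulpn | listening_on_port 22 && echo "sshd is listening"
```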
Step II : Disable Firewall
The firewall can be disabled by running the following commands in the VM terminal:
service iptables stop
chkconfig iptables off
Step III : VirtualBox configuration
The VirtualBox instance in which the CDH VM image is set up must have the following network settings.
Now, restart the Virtual Machine.
Once the virtual machine has started, open the terminal application inside it and run ifconfig.
It will show you an IP address. Note down this IP address.
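If the ifconfig output is long, a small filter can pull out just the addresses. This is a rough sketch under my own assumptions: the function name vm_ipv4_addresses is made up, and it expects either the older "inet addr:..." layout or the newer "inet ..." layout of ifconfig output. It skips the loopback address, which is of no use for reaching the VM from the host.

```shell
# Hypothetical helper: print every non-loopback IPv4 address found in
# ifconfig output read from stdin. Handles both the older
# "inet addr:10.0.2.15" layout and the newer "inet 10.0.2.15" layout.
vm_ipv4_addresses() {
    awk '
        $1 == "inet" {
            ip = $2
            sub(/^addr:/, "", ip)          # strip the old-style prefix
            if (ip !~ /^127\./) print ip   # ignore loopback
        }
    '
}

# Usage inside the VM:
#   ifconfig | vm_ipv4_addresses
```

If more than one address is printed, the VM has several network adapters, and the one to use depends on the VirtualBox network settings.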
Finally, run the ssh command from your host system against this IP address and log in to the VM.
ssh username@VM_IP_ADDRESS
To copy files between the local system and the virtual machine, use the scp command from a terminal on the local system.
Copy file from local to VM:
scp local_file username@VM_IP_ADDRESS:/Directory_Location_on_VM
Copy file from VM to local:
scp username@VM_IP_ADDRESS:/File_location_on_VM/file_name /Directory_Location_on_local_system
Compile your JAR/class files on the local system, copy them to the VM using scp, and run the Hadoop job by logging into the VM over ssh and executing the commands there.
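The copy-and-run part of that workflow can be wrapped in a small script on the host. What follows is only a sketch under stated assumptions: the user name, IP address, /tmp/ target directory, jar name, class name and HDFS paths are all placeholders of mine, and the helper names run and deploy_and_run are made up. It prints the commands by default so you can check them before anything touches the VM.

```shell
# All values below (VM_USER, VM_HOST, the /tmp/ target, jar and class names)
# are placeholders for illustration; replace them with your own.
VM_USER="${VM_USER:-username}"
VM_HOST="${VM_HOST:-VM_IP_ADDRESS}"

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy a locally built jar to the VM, then launch it there with `hadoop jar`.
deploy_and_run() {
    jar="$1"; main_class="$2"; input="$3"; output="$4"
    run scp "$jar" "$VM_USER@$VM_HOST:/tmp/"
    run ssh "$VM_USER@$VM_HOST" \
        "hadoop jar /tmp/$(basename "$jar") $main_class $input $output"
}

# Usage (dry run by default, so it only prints the two commands):
#   deploy_and_run wordcount.jar WordCount input_dir output_dir
# To actually execute them:
#   DRY_RUN=0 deploy_and_run wordcount.jar WordCount input_dir output_dir
```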