Friday, August 29, 2014

Using netstat to grep pids and LISTEN ports

$sudo netstat -anp |grep LISTEN | grep -v "::"
tcp        0      0 0.0.0.0:38860               0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 0.0.0.0:57583               0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 0.0.0.0:2000                0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 0.0.0.0:8080                0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1054/sshd
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1161/master
tcp        0      0 0.0.0.0:57824               0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 0.0.0.0:8000                0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 127.0.0.1:8005              0.0.0.0:*                   LISTEN      1497/java
tcp        0      0 0.0.0.0:58535               0.0.0.0:*                   LISTEN      1497/java
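
On newer distributions where net-tools/netstat is deprecated, a similar listing can likely be obtained with ss from iproute2:

$ sudo ss -tlnp        # -t TCP, -l listening sockets, -n numeric ports, -p owning PID/program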

Thursday, August 28, 2014

Changing font color in a Linux system

In many Linux AMIs the default directory font color is hardly readable, and I am not sure why the default color is set to "34" (blue) in the /etc/DIR_COLORS file. In order to change it, you can do the following

$sudo vi /etc/DIR_COLORS
...
DIR 01;34       # directory
...

Change it to cyan, which will improve readability

...
DIR 01;36       # directory

After making the change, save the file and source your profile, or log out and log back in, to see the changes.

If you would like to set it to another color, you may choose from:

30=black
31=red
32=green
33=yellow
34=blue
35=magenta
36=cyan
37=white
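
If you do not want to re-login, the new colors can likely be loaded into the current shell with dircolors (part of GNU coreutils):

$ eval "$(dircolors -b /etc/DIR_COLORS)"   # regenerates LS_COLORS from the updated file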

Saturday, August 23, 2014

CloudBerry Explorer - UI alternative to s3cmd or AWS CLI commands

If you have storage on Amazon S3, then sooner or later you will need a tool that can back up entire folders to local disk or to another cloud storage service. The Amazon S3 console only lets you download specific files within a bucket at a time. CloudBerry Explorer has a free version and a pro version, and I am happy to report that the free one does the job fairly well.

You would have to create an S3 bucket and an IAM user with an S3 "read-only" bucket policy like the one below:-

************
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "*"
        }
    ]
}
************
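
If you prefer the command line, the same IAM user and its policy can likely be created with the AWS CLI; the user name, policy name, and file name below are just examples, with the JSON above assumed to be saved as s3-read-only.json:

************
$ aws iam create-user --user-name cloudberry-readonly
$ aws iam put-user-policy --user-name cloudberry-readonly --policy-name s3-read-only --policy-document file://s3-read-only.json
$ aws iam create-access-key --user-name cloudberry-readonly
************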

Next, download and install CloudBerry Explorer and then configure it with the credentials of the IAM user you just created.


Once it is set up, you can download entire buckets or sync with other storage options.

Monday, August 18, 2014

Use 'bcrypt' algorithm for hashing passwords

bcrypt is an adaptive password hashing algorithm rather than a symmetric encryption algorithm. It is much slower than MD5 or SHA-1, but more secure. The bcrypt command-line utility takes a file as input and, when decrypting, can output to stdout.

**********
$ date; bcrypt -c -s12 test.txt;date
Mon Aug 18 23:04:36 PDT 2014
Encryption key:Key must be at least 8 characters
Encryption key:
Again:
Mon Aug 18 23:04:47 PDT 2014

$ date; bcrypt -o test.txt.bfe;date
Mon Aug 18 23:05:38 PDT 2014
Encryption key:
this is a sample text
Mon Aug 18 23:05:41 PDT 2014

***********
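
For hashing login passwords (as opposed to encrypting files), a bcrypt hash can likely be generated from the shell with Apache's htpasswd utility, assuming a sufficiently recent version is installed (the -B bcrypt option appeared in httpd 2.4); the user name and password below are just placeholders:

$ htpasswd -nbB appuser 'example-password'   # -n print to stdout, -b password on command line, -B use bcrypt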

No Amazon SES service in Tokyo region

I recently had a request from a customer in Japan to enable the SES service in Amazon's Tokyo region. However, we found that it was not possible when we ran into the error below:-



Below is the response from AWS Support:-

"Yes, that is correct, SES is not currently supported in Tokyo Region. I do not have a roadmap for it to be available in Tokyo but I will ask the SES Engineering team to see if they can give an idea of when it will be available. We dont have another service such as SES that would keep there data in Japan so if the requirement is to keep data in Japan then another SMTP service such Sendgrid may have a Japanese region."
....
"I have received word from our SES engineering team that there are no current plans to include Tokyo region for SES. Apologies that we cannot be more of help with this regard."

C'mon AWS, if SendGrid can do it, you can!

MySQL exception in RDS:com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'content' at row 1

With the default MySQL 5.5 RDS DB parameter group, you may see an exception thrown with the stack trace below:-

***********
com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'content' at row 1
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4188)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4122)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2818)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2157)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2460)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2377)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2361)
***********

This is probably because the MySQL parameter "max_allowed_packet" is left at its default, which is too small. You will have to create a custom DB parameter group (here called "mydbgroup") and explicitly set "max_allowed_packet" to something like "33554432" (32 MB). Then modify the RDS instance to use "mydbgroup" as its parameter group and check the "apply immediately" box. Once the parameter group takes effect, the instance will show a status of "pending-reboot", and you will then have to reboot your RDS instance.

Once the changes are in place, you can confirm by running the below MySQL commands

************
mysql> SELECT @@max_allowed_packet;
+----------------------+
| @@max_allowed_packet |
+----------------------+
|             33554432 |
+----------------------+
1 row in set (0.00 sec)
************

Alternatively, if you have the legacy RDS command line tools installed, you can set the parameter from the command line:-

$ rds-modify-db-parameter-group mydbgroup --parameters "name=max_allowed_packet, value=33554432, method=immediate"
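
With the newer unified AWS CLI, the equivalent is likely:-

$ aws rds modify-db-parameter-group --db-parameter-group-name mydbgroup --parameters "ParameterName=max_allowed_packet,ParameterValue=33554432,ApplyMethod=immediate"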

Saturday, August 9, 2014

RHEL Kernel tuning parameters for improved throughput handling


  • Before making any kernel parameter changes, back up the current settings: $sudo sysctl -a > $HOME/oldsettings.txt
  • Edit /etc/sysctl.conf and add the below lines at the end of the file

==============
fs.file-max = 2203600
net.ipv4.ip_local_port_range = 1024 65535
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_tw_recycle = 1
net.core.netdev_max_backlog = 16384
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 16384 16777216
net.core.somaxconn = 4096
kernel.sched_migration_cost = 5000000
==============

  • Reload the updated kernel parameters by running $sudo sysctl -p
  • check that the updated values are reflected in /proc, for example:
==============
$ cat /proc/sys/net/ipv4/tcp_tw_reuse
$ cat /proc/sys/net/ipv4/tcp_tw_recycle
==============
  • check whether the parameters have been updated by running sysctl again with parameter name
==============
$sysctl <parameter name> 
e.g.: $sysctl net.ipv4.ip_local_port_range
$cat /proc/sys/fs/file-max
$cat /proc/sys/kernel/sched_migration_cost

==============
  • Edit /etc/security/limits.conf and add soft and hard limits for httpd ("daemon") and default "ec2-user"

==============
daemon   soft    nofile   10000
daemon   hard    nofile   50000
ec2-user soft    nofile   30000
ec2-user hard    nofile   50000
==============

  • No reboot of instance is required.
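
After re-logging in as the affected user, you can verify that the new file descriptor limits have taken effect with ulimit:

==============
$ ulimit -Sn    # soft limit
$ ulimit -Hn    # hard limit
==============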

Apache httpd reverse proxy settings for improved performance

In httpd.conf, you can set the directives below in the "ServerRoot" section:-

ServerRoot "/opt/products/apache2"
StartServers 10
ServerLimit 150
MaxClients 150
MaxKeepAliveRequests 120
KeepAlive Off
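
After editing httpd.conf, the configuration can usually be validated and applied without a full restart, assuming the standard apachectl wrapper is available:

$ sudo apachectl configtest
$ sudo apachectl -k graceful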

Curl commands to check the performance of a site

$ for X in `seq 60`;do curl -Ik -w "HTTPCode=%{http_code} timeconnect=%{time_connect} timetransfer=%{time_starttransfer} Total_time=%{time_total}\n" https://www.example.com -so /dev/null; done
HTTPCode=200 timeconnect=0.192 timetransfer=0.556 Total_time=0.556
HTTPCode=200 timeconnect=0.065 timetransfer=0.404 Total_time=0.404
HTTPCode=200 timeconnect=0.065 timetransfer=0.413 Total_time=0.413


$ for X in `seq 60`;do curl -k -w "HTTPCode=%{http_code} timeconnect=%{time_connect} timetransfer=%{time_starttransfer} Total_time=%{time_total}\n" https://www.example.com -so /dev/null; done
HTTPCode=200 timeconnect=0.071 timetransfer=14.218 Total_time=14.218
HTTPCode=200 timeconnect=0.071 timetransfer=16.716 Total_time=16.716
HTTPCode=200 timeconnect=0.065 timetransfer=19.961 Total_time=19.962


$ curl -vv --insecure -o /dev/null -s -w %{time_connect}:%{time_starttransfer}:%{time_total} 'https://www.example.com'
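
To break the connect time down further, curl can also report DNS lookup and TLS handshake timings through additional --write-out variables:

$ curl -sk -o /dev/null -w "dns=%{time_namelookup} tcp=%{time_connect} tls=%{time_appconnect} ttfb=%{time_starttransfer} total=%{time_total}\n" https://www.example.com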

Tuesday, August 5, 2014

Can't SSH into one machine running in Tokyo region from jump host

I ran into a strange issue where one of the machines running in an Amazon VPC in the ap-northeast-1b zone could not be SSH'ed into from the jump host in our data center. However, another NAT instance in the same CIDR block could be SSH'ed into from the same jump host.

Timeout from jump host:-

$ ssh -i key.pem user@ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com
ssh: connect to host ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com port 22: Connection timed out

After checking the security group to make sure that the correct NATed external IP address was allowed on the ingress port, a simple nmap command showed that the port was not in "open" state:

$ nmap ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com -p 22 -P0
Starting Nmap 5.21 ( http://nmap.org ) at 2014-08-05 10:16 PDT
Nmap scan report for ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com (54.x.x.x)
Host is up.
PORT   STATE    SERVICE
22/tcp filtered ssh

Nmap done: 1 IP address (1 host up) scanned in 13.09 seconds
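
A quick connectivity check with netcat, if it is installed on the jump host, would likely show the same symptom; the connection attempt simply times out:

$ nc -vz -w 5 ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com 22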

The next step is to check whether there are valid routes to that particular machine. In order to run "traceroute", you may need to request "sudo" privileges from your jump host administrator:

$ sudo traceroute -T -p 22 54.x.x.x
[sudo] password for user:
traceroute to 54.x.x.x (54.x.x.x), 30 hops max, 60 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *

Since there are no routes visible, the most likely cause is that the firewall behind which the jump host sits is blocking connections to this particular instance. To confirm, you can run a packet capture such as:

$sudo tcpdump -nnvv host 54.x.x.x -w capture.pcap

Now try the SSH again and then inspect the captured packets:

$ sudo tcpdump -r capture.pcap
[sudo] password for user:
reading from file capture.pcap, link-type EN10MB (Ethernet)
23:52:41.715663 IP jumphost.54540 > ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com.ssh: Flags [S], seq 310869746, win 14600, options [mss 1460,sackOK,TS val 1367537756 ecr 0,nop,wscale 7], length 0
23:52:44.715767 IP jumphost.54540 > ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com.ssh: Flags [S], seq 310869746, win 14600, options [mss 1460,sackOK,TS val 1367540756 ecr 0,nop,wscale 7], length 0
23:52:50.715717 IP jumphost.54540 > ec2-54-x-x-x.ap-northeast-1.compute.amazonaws.com.ssh: Flags [S], seq 310869746, win 14600, options [mss 1460,sackOK,TS val 1367546756 ecr 0,nop,wscale 7], length 0

As we can see above, the three-way TCP handshake does not proceed further than the SYN packets sent from the jump host; there is no SYN-ACK followed by an ACK, so the connection is never established. Now we have enough information to talk to the relevant folks in IT who manage the jump host.

Monday, August 4, 2014

Quick rsync command to move files to a bastion host/jump host behind an MFA token

$ rsync -v -e ssh key.pem user@jumphost:~
Verification code:
Password:
key.pem
sent 1780 bytes  received 31 bytes  61.39 bytes/sec
total size is 1692  speedup is 0.93
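
The same form works in reverse if you need to pull a file back down from the jump host (the paths here are only examples):

$ rsync -v -e ssh user@jumphost:~/key.pem .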

Once again problems in AWS Tokyo region: the APIs were not responding

The Tokyo region of AWS is fairly new and seems to be having its fair share of stability problems. This morning there were API failures and the console was not showing VPCs, as below:-


AWS Support confirmed that there was a problem and their response is below:-

"We're aware that there was a problem in the Tokyo region where we had problems with the API calls that were used to gather this data. If you refer to our Service Health Dashboard (http://status.aws.amazon.com/) you'll see that we had a couple of postings on the issue. For your reference, those were:

10:37 AM PDT We are investigating increased API error rates and latencies for the EC2 APIs in the AP-NORTHEAST-1 Region.
10:44 AM PDT Between 10:13 AM and 10:34 AM PDT we experienced increased error rates and latencies for the EC2 APIs in the AP-NORTHEAST-1 Region. The issue has been resolved and the service is operating normally."

Saturday, August 2, 2014

Testing NodeQuery Linux server monitoring in public beta phase

I have been trying out the NodeQuery monitoring service for Linux servers for the past few months and I am finding it fairly intuitive and simple to use. Their web interface is well laid out and easily navigable. However, I am not sure if the uptime metrics they report are accurate. The reason is that I got an email alert this morning about a particular server not being reachable:-

**********
From: NodeQuery <hello@nodequery.com>
Date: Sat, Aug 2, 2014 at 10:13 AM
Subject: [ALERT] server A is not responding
To:  <user@example.com>


Hello user,

it seems one of your servers is not responding anymore.

Server: server A
Last Update: 2014-08-02 10:12:12
Alert Trigger: 2014-08-02 10:12:01

If you don't want to receive alerts anymore, log into your account and edit the
notification settings for your server.

Feel free to reply to this message if you are experiencing problems with our
services.

Thanks,
NodeQuery.com

**********

I logged into the server and was able to verify that the server was indeed up during that time interval. Consequently, the uptime reported on the NodeQuery dashboard was 99.88%. This percentage translates to approximately 50 minutes of downtime per month, or around 10 hours of downtime per year. I am pretty certain that this server was not down for that long. So it looks like even temporary network glitches can be the bane of agent-based monitoring systems: the agent fails to communicate with the server and uptime reports get skewed.

However, I like their single command to uninstall their agent:-

$sudo rm -R /etc/nodequery && (crontab -u nodequery -l | grep -v "/etc/nodequery/nq-agent.sh") | crontab -u nodequery - && userdel nodequery

VPC creation in Tokyo region failing for ap-northeast-1a zone

I was trying to create a VPC with public and private subnets, and the VPC creation was failing because I had set the availability zone of the public subnet to "ap-northeast-1a". The error message was something like the below:-


I got the below message from Amazon technical support regarding the above error:-

"VPC Creation Failed:
There was an error creating your VPC: Value (ap-northeast-1a) for parameter availabilityZone is invalid. Subnets can currently only be created in the following availability zones ap-northeast-1c, ap-northeast-1b." I researched this issue and it appears that that AZ is at capacity and all new creations in this AZ will not succeed. Currently only the two AZ's noted in the error above are able to create new subnets in. ap-northeast-1c, ap-northeast-1b. As explained on the call each AZ (A, B, C) for each customers' account can vary to which data center they map to. This means that while Customer X cannot create subnets in AZ A, Customer Y may be able to if his AZ A maps to a different data center, but will be unable to create subnets in AZ B if it maps to the same datacenter as customer X's AZ A. I do agree that it is very inconvenient to be presented with an option that is not able to be used. I am going to reach out to the console team that only AZs that are able to have subnets created in them, be presented in the subnet Wizard. I also was unable to find documentation explaining this. Since I firmly believe that explanations should be available for this without contacting support, I am going to reach out to our documentation team to create a document explaining why these situations arise. Unfortunately since both of the above are feature request, I cannot comment on if or when these changes will be implemented due to internal regulations."