Naresh Kumar a.k.a tEqUiL@

Friday, October 3, 2008

Have you ever wondered ...

This post is a little sidetrack to a topic that appears to be completely obsolete from the perspective of an ordinary techie. But there could be instances where the same techie (a human being, I guess) has experienced the same set of internal emotions. Emotions which cannot be described ... and even if they could be, my techie guy would not describe them. These are the emotions that don't find any space or time to be expressed on a normal day, even though they all come and go on very, very normal days. They catch you unprepared. As a matter of fact, they are one of those things in life for which you can never prepare yourself. Even if you pretend to be ready for the sudden and unexpected burst of those emotions, you will never be able to tackle them the way you had planned. They just come and leave watermarks on your memory.

Watermarks ... they are always there. It's just that you can't see them with your naked eye; you need to apply some really complex transformations to get this watermark out of your head. If that had been easy, there would have been an IEEE paper on it. Fortunately or unfortunately (I don't know which), there isn't. There is no research paper or webpage to date that could give some information about getting those emotions out of your head. At least no web crawler so far has visited that kind of page.

I am writing something that is not making any sense at all, but sometimes you may understand me. This is what comes out when you watermark just everything that has happened in your life and keep the key to those watermarks to yourself. You need to learn how to share that key. Share the key of your watermarks with the person you love, or pretend to love. But do take care that the person loves you back; otherwise he or she can just go and hide that key in one of his or her own watermarks and never wish to share it with you. This might seem normal, or just something that you have a backup plan for.

But trust me, this is the kind of situation that results in the embedding of those watermarks, and these watermarks are more robust and resilient to attacks than all those earlier ones which you had intentions of sharing. These kinds of emotions are so secured and encrypted that even a decoder that could break 256-bit encryption cannot help you out. Who decides the level of encryption of these emotions? I guess it's the willingness to be alike, the desire to fit in. Nobody wants to be different in ways that are not socially acceptable. So we have reached the issue of social acceptance. I really don't know what I am writing, but social acceptance is something on which I would certainly love to comment but will not, because I don't want my society to disregard me. See, another example of watermarking emotions, thoughts, ideas and comments just because of a silly greed for being more and more socially acceptable.

In an abstract sense, this piece of writing doesn't make any sense. But it may give some insight to those who went ahead and took the pain of going through this article after at least 3 beers and a couple of cigarettes. But don't bother. It was all crap. I intentionally wrote this to confuse myself.

Thursday, February 8, 2007

How to Port Linux to ARM9

I did this in my embedded systems project.

Here are the exact steps to follow:

Porting of Linux onto the ARM9 board

VARIOUS ASPECTS OF PORTING LINUX TO AN ARM9-BASED BOARD:-

1. Board:- The SMDK2410 has an S3C2410 microprocessor based on the ARM9 architecture.
The SMDK2410 is a platform suitable for code development on SAMSUNG's S3C2410, a 16/32-bit RISC microcontroller (ARM920T) for hand-held devices and general applications.

It shows a basic system-level hardware design that uses the S3C2410. It can be used to evaluate the basic operations of the S3C2410X01 and to develop code for it.

The SMDK2410 (Samsung MCU Development Kit) consists of the S3C2410, boot EEPROM (Flash ROM), SDRAM, an LCD interface, two serial communication ports, configuration switches, a JTAG interface and status LEDs.

2. Overview:- The SMDK2410 is a quite resource-limited device. The functions that we require on our board after porting Linux to it:-

- Webcam support
- Serial port API
- Some memory for handling files up to 50 KB and image compression code

We are using a 2.6 kernel for our port.
For the time being we are booting Linux from RAM over JTAG. In the near future we will try to implement booting from Flash memory or some other non-volatile memory.

The boot loader we will be using is vivi.

3. Steps for porting Linux:-

1. Installing cross compilers:- Since the architecture of a normal PC is different from that of the target board, we need to cross compile our kernel so that it can run on the target.
We installed ARM Linux toolchains for compiling the kernel for ARM.

The following files were used to install the cross-platform toolchains:-

2.6.10-at91.patch.gz
26_at91_serial.c.gz
binutils-2.16
binutils-2.16.tar.gz
flow.c.diff
gcc-3.4.4.tar.bz2
glibc-2.3.5.tar.gz
ioperm.c.diff
linux-2.6.10.tar.gz
t-linux.diff

Building the toolchain:

gcc-core contains only the C compiler. If you want other languages, you can download their sources separately or use the full gcc sources.

I downloaded the sources to /usr/local/src/arm-elf-tools/src and then used the following script:

#! /bin/sh
# Builds an arm-elf cross toolchain: binutils, gcc (C only), newlib and gdb.

# Variables
export target=arm-elf
export prefix=/usr/local/src/arm-elf-tools
export PATH="$prefix/bin":"$PATH"

# Unpack the sources
cd $prefix/src
tar jxf binutils-2.17.tar.bz2
tar jxf gcc-core-4.1.1.tar.bz2
tar zxf newlib-1.14.0.tar.gz
tar jxf gdb-6.5.tar.bz2

# Build binutils
cd $prefix
mkdir build-binutils
cd build-binutils
../src/binutils-2.17/configure --target=$target --prefix=$prefix -v
make all install
cd ..

# Build gcc (first pass). The continuation lines must follow each other
# directly; a blank line after a backslash ends the command.
mkdir build-gcc
cd build-gcc
../src/gcc-4.1.1/configure --target=$target --prefix=$prefix -v \
    --with-arch=armv4t --with-cpu=arm7tdmi --with-newlib \
    --enable-languages=c \
    --enable-static --disable-shared --disable-threads \
    --disable-libssp --disable-libmudflap
make all install
cd ..

# newlib wants arm-elf-cc but there is only arm-elf-gcc, so we make a link
cd bin
ln -s arm-elf-gcc arm-elf-cc
cd ..

# Build newlib
mkdir build-newlib
cd build-newlib
../src/newlib-1.14.0/configure --target=$target --prefix=$prefix -v
make all install
cd ..

# Second make run for gcc, now that newlib is installed
cd build-gcc
make all install
cd ..

# Build gdb
mkdir build-gdb
cd build-gdb
../src/gdb-6.5/configure --target=$target --prefix=$prefix -v
make all install

2. Compiling the kernel:- We are building kernel 2.6.10, using arm-elf-gcc as the compiler. Instructions for compiling the kernel come with the kernel source package that we downloaded. The kernel was compiled using the following commands:-

make menuconfig
# configure the kernel ourselves rather than accepting the default configuration

make zImage
# build the compressed kernel image

Note: Set up a proper directory structure before building the kernel. When cross compiling, the target architecture and toolchain prefix also have to be given to make, typically as ARCH=arm CROSS_COMPILE=arm-elf- on each command line (assuming the arm-elf toolchain built above is on the PATH).

3. Next step:- Porting the custom-built Linux onto the target board using JTAG. This is an easy step, but we are still waiting because no JTAG unit has been available so far.

4. Also configuring the boot loader to boot from the kernel image.

Thursday, January 4, 2007

IP Masquerading with Linux

It seems everyone wants on the Internet nowadays, and for good reason. There is plenty of information to obtain, people to send e-mail to, web pages to look at and software to download. Besides that, businesses are finding acceptable means of advertising, and in some cases, selling merchandise. But with all the rush to get on the Internet, people are finding Internet addresses are not as readily available as they once were. Some network administrators are finding that, in many environments, they don't have enough network addresses to meet the demand.

Instead of going through the motions of obtaining another block or two of class C addresses, some administrators hide a set of unregistered addresses behind a network address translation (NAT) device. The Internet is prepared for these "private" addresses, and blocks of addresses are reserved for this purpose. RFC 1597 specifies the addresses 10.0.0.0 through 10.255.255.255, 172.16.0.0 through 172.31.255.255, and 192.168.0.0 through 192.168.255.255 to be used in these instances.
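As a small illustration of the ranges listed above, the following Python sketch checks whether an address falls inside one of the three reserved blocks. The function name is_private_address and the sample addresses are made up for this example; the CIDR prefixes are simply the three ranges rewritten in prefix notation.

```python
import ipaddress

# The three reserved ranges from the text, written as CIDR prefixes.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 - 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 - 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 - 192.168.255.255
]


def is_private_address(addr: str) -> bool:
    """Return True if addr lies inside one of the reserved private blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in PRIVATE_BLOCKS)


if __name__ == "__main__":
    # Hypothetical sample addresses, chosen to sit on both sides of the edges.
    for candidate in ("10.1.2.3", "172.31.255.255", "172.32.0.1", "192.168.1.2"):
        print(candidate, is_private_address(candidate))
```

Only 172.32.0.1 falls outside the reserved blocks in this sample, because the second range stops at 172.31.255.255.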

The RFC strongly recommends that if you, as a network administrator, are going to use a private address, you should select addresses from the ranges given. One notably important reason is that if a packet happens to pass through the NAT with its original IP address intact, the backbone routers on the Internet will not forward the packet. If, instead, you were using someone else's valid IP address, confusion could occur.

Many firewalls, especially those based on application proxy gateways, naturally hide addresses because of how they function. It is no surprise that Linux can also support address hiding through what is called "IP masquerading". Setting up masquerading under Linux is not terribly difficult, but there are some subtleties to point out.

Getting Ready
If you are running kernel version 1.2.x, you need to obtain the kernel patch to support masquerading. The patch is available from ftp://ftp.eves.com/pub/masq, or you can download everything you need from http://www.indyramp.com/masq/. IP masquerading is supported with 1.3.x kernel versions. For this article, I was running version 1.3.56, and all examples are based on this version. For FTP support (mentioned later), you need to have at least kernel version 1.3.37. There is a patch for 1.2.x (where x >= 4) kernels to support FTP, but I haven't tested it yet. The masqplus-0.4 "jumbo" patch that is available from Indyramp fixes a few bugs and adds support for FTP, RealAudio, and fragmentation for 1.2.13 kernels.

When configuring the kernel to support masquerading, it is important to also say yes to firewall and forwarding support. Here are the parameters I used for configuring my kernel:

Network firewalls (CONFIG_FIREWALL) [Y/n/?] y
Network aliasing (CONFIG_NET_ALIAS) [Y/n/?] y
TCP/IP networking (CONFIG_INET) [Y/n/?] y
IP: forwarding/gatewaying (CONFIG_IP_FORWARD) [Y/n/?] y
IP: multicasting (CONFIG_IP_MULTICAST) [Y/n/?] y
IP: firewalling (CONFIG_IP_FIREWALL) [Y/n/?] y
IP: accounting (CONFIG_IP_ACCT) [Y/n/?] y
IP: tunneling (CONFIG_NET_IPIP) [Y/m/n/?] y
IP: firewall packet logging (CONFIG_IP_FIREWALL_VERBOSE) [Y/n/?] y
IP: masquerading (ALPHA) (CONFIG_IP_MASQUERADE) [Y/n/?] y

I chose other items not directly related to masquerading such as multicast and tunneling, but I like to have fun.

Notice the IP masquerading software is still considered to be Alpha-quality. This means there are probably still some bugs. The base functionality is there, but not all of the nuances of TCP, UDP, and IP, nor the application protocols, have been thoroughly tested. In addition, the interface may still change as development proceeds.

In order to manipulate the masquerading ruleset, you will need the ipfw software version 1.3.6-BETA3, or you can obtain a precompiled binary from ftp.eves.com. Those who use Linux as a filtering firewall and also use ipfwadm should note that software does not yet support IP masquerading, so ipfw is necessary. [New: ipfwadm 2.0beta2, now available for Linux 1.3.66 and newer from ftp://ftp.xos.nl/pub/linux/ipfwadm/, does support masquerading. Also, it is necessary to use recent versions of ipfwadm with the most recent versions of the kernel due to interface changes.--ED]

Applying the Rules
Let's first define what we're trying to accomplish and see how IP masquerading is useful in the environment. Figure 1 shows the networks on which the examples are based. deathstar is the Linux machine employing masquerading in order to hide the network 192.168.1.0.

Masquerading is useful in our architecture because it saves us a little administrative hassle. A number of people in my department have home LANs, and through their PPP connection they can use their other machines to connect to the department lab. We could easily run a routing protocol, like RIP, to make the machines on the lab network aware of the home LANs, but that would take some coordination about who has what network address. It is easier (for us) to use masquerading.

To hide the network, we can issue the command:

# ipfw a m all from 192.168.1.0/24 to 0.0.0.0/0

This rule indicates that we want to add a masquerading rule for all protocols (which in this case means TCP and UDP). The network we are hiding is 192.168.1.0, and we are hiding connections going to any network (0.0.0.0/0). The /24 indicates we are applying a 24-bit netmask, or 255.255.255.0. Since we specified the network as 192.168.1.0, deathstar will masquerade for all hosts on the network. That's all we need to do.

If I had only wanted deathstar to masquerade for enterprise, then I would have typed in:

# ipfw a m all from 192.168.1.2/32 to 0.0.0.0/0
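To make the prefix notation in the two rules above concrete, here is a minimal Python sketch that checks which packets each rule would cover. The helper rule_matches and the host 192.168.1.77 are hypothetical; the sketch only models the address matching, not ipfw itself.

```python
import ipaddress


def rule_matches(src: str, dst: str, from_net: str, to_net: str = "0.0.0.0/0") -> bool:
    """Return True if a packet from src to dst is covered by the from/to rule."""
    return (ipaddress.ip_address(src) in ipaddress.ip_network(from_net)
            and ipaddress.ip_address(dst) in ipaddress.ip_network(to_net))


if __name__ == "__main__":
    # A /24 prefix corresponds to the netmask 255.255.255.0.
    print(ipaddress.ip_network("192.168.1.0/24").netmask)

    # The /24 rule covers every host on the hidden network ...
    print(rule_matches("192.168.1.77", "20.2.51.2", "192.168.1.0/24"))
    # ... while the /32 rule covers only the single host 192.168.1.2.
    print(rule_matches("192.168.1.77", "20.2.51.2", "192.168.1.2/32"))
    print(rule_matches("192.168.1.2", "20.2.51.2", "192.168.1.2/32"))
```

The destination 0.0.0.0/0 has a zero-length prefix, so every destination address matches it.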

But what does it really mean "to masquerade for"? Well, let's examine the affected files and kernel tables for a typical masqueraded connection. We'll use telnet for our example.

Let's verify the rule has been set. We need to look at the ip_forward file in the /proc/net directory. We can use ipfw to do this:

# ipfw -n list forward
Type Proto From To Ports
masquerade all 192.168.1.0/24 anywhere

This is good. Some administrators mistakenly look in the /proc/net/ip_masquerade file for the rule and when they don't see it, confusion sets in.

For our example, I've started a telnet session from warbird to enterprise. Also, on mccoy, I'm using the tcpdump program to monitor the traffic on 20.2.51.0 and sparcbook to monitor the traffic on 192.168.1.0. We can now look at the ip_masquerade file to examine what is happening (see Listing 1).

Let's decode this stuff. First, the earliest packet is at the bottom. It is a DNS request (therefore UDP) from 192.168.1.2 to 20.2.51.2. mccoy is warbird's DNS server in this case. The Masq column shows us the port on deathstar that is used for the masquerading. For the first DNS request, it is port 60000 (EA60). After the DNS resolution, the TCP connection is established on the next available port over 60000, 60001. Figure 2 illustrates the protocol time-line for the sequence of events up to the TCP open.
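The port bookkeeping described above can be pictured with a toy table. This is only a sketch of the idea, not the kernel's data structure: the class MasqTable, the client source ports 1025 and 1026, and the telnet destination 20.2.51.3 are assumptions made for illustration. The starting port 60000 and the DNS server 20.2.51.2 come from the text.

```python
MASQ_PORT_BASE = 60000  # first masquerading port mentioned in the text


class MasqTable:
    """Toy model of a masquerading table: one public port per outbound flow."""

    def __init__(self):
        self.next_port = MASQ_PORT_BASE
        self.by_flow = {}   # (proto, src, dst) -> masquerading port
        self.by_port = {}   # (proto, masquerading port) -> (proto, src, dst)

    def outbound(self, proto, src, dst):
        """Return the masquerading port for a flow, allocating one if needed."""
        flow = (proto, src, dst)
        if flow not in self.by_flow:
            port = self.next_port
            self.next_port += 1
            self.by_flow[flow] = port
            self.by_port[(proto, port)] = flow
        return self.by_flow[flow]

    def inbound(self, proto, masq_port):
        """Return the hidden (ip, port) a reply should be forwarded to, or None."""
        flow = self.by_port.get((proto, masq_port))
        return flow[1] if flow else None


if __name__ == "__main__":
    table = MasqTable()
    dns = table.outbound("udp", ("192.168.1.2", 1025), ("20.2.51.2", 53))
    telnet = table.outbound("tcp", ("192.168.1.2", 1026), ("20.2.51.3", 23))
    print(dns, format(dns, "X"))   # the Masq column shows the port in hex
    print(telnet)
    print(table.inbound("udp", dns))
```

The first flow gets port 60000, which is EA60 in hexadecimal, and the next flow gets 60001, matching the order described for the DNS lookup and the TCP open.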

Even though the protocol time-line shows how the packets really travel, the sending and receiving nodes are unaware of this; hence the name masquerading. From warbird's point of view, the traffic will look exactly as expected. That is, packets from enterprise are repackaged by deathstar to look as if they came from enterprise. Listing 2 shows the tcpdump output of the traffic on the 192.168.1.0 network for the telnet session.

Listing 3 shows the protocol traffic on the 20.2.51.0 network during the telnet session. Notice that information originates from deathstar, not warbird. (Another thing you might notice is I don't keep the clocks synchronized very well.)

Another important aspect is maintaining the TCP synchronization numbers. For masquerading to work properly, deathstar must keep the synchronization correct. The TCP sequence number generated by warbird is forwarded by deathstar rather than a new sequence number being generated.

Some final observations about the contents of the /proc/net/ip_masquerade file pertain to the last four fields. The Init-seq, Delta, and PDelta fields deal with the TCP synchronization numbers when ftp data transfers (more in a minute) occur, and the last field is the expiration timer on the masquerade entry. The time is kept in hundredths of seconds; TCP is given 90000 or 15 minutes, and UDP is given 30000 or 5 minutes. As long as traffic is being passed between the two communicating hosts for the masked port, the timer will remain updated. A minor detail about the expiration timer has to do with FTP transfers. FTP uses two connections: a control connection for commands and a data connection for a file transfer. While the data connection is in use for data movement, the control connection will sit idle. If the transfer takes longer than 15 minutes, the masquerading host will close the control connection. The data connection will go to completion, but you will have to reconnect if you want to get more files. This is controlled by the definitions:

#define MASQUERADE_EXPIRE_TCP 15*60*HZ
#define MASQUERADE_EXPIRE_TCP_FIN 2*60*HZ
#define MASQUERADE_EXPIRE_UDP 5*60*HZ

in the file /usr/include/linux/ip_fw.h. Six hours (360 minutes) seems to be a relatively acceptable timeout value, but change it as you see fit.
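The timer values can be checked with a little arithmetic. The sketch below assumes HZ is 100, which is what "hundredths of seconds" implies; the helper names and the 20-minute transfer are invented for the example.

```python
HZ = 100  # timer ticks per second, assuming the hundredths-of-a-second clock

# Same expressions as the three #define lines.
MASQUERADE_EXPIRE_TCP = 15 * 60 * HZ
MASQUERADE_EXPIRE_TCP_FIN = 2 * 60 * HZ
MASQUERADE_EXPIRE_UDP = 5 * 60 * HZ


def ticks_to_minutes(ticks: int) -> float:
    """Convert a timer value in ticks back to minutes."""
    return ticks / (60 * HZ)


def control_connection_expires(transfer_minutes: float) -> bool:
    """True if an idle FTP control connection would time out during a transfer."""
    return transfer_minutes * 60 * HZ > MASQUERADE_EXPIRE_TCP


if __name__ == "__main__":
    print(MASQUERADE_EXPIRE_TCP, ticks_to_minutes(MASQUERADE_EXPIRE_TCP))
    print(MASQUERADE_EXPIRE_UDP, ticks_to_minutes(MASQUERADE_EXPIRE_UDP))
    # A hypothetical 20-minute download outlives the idle control connection.
    print(control_connection_expires(20))
```

With these values the TCP timer is 90000 ticks and the UDP timer is 30000 ticks.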

Problems
Not all protocols work with IP masquerading. ICMP messages (such as those used by ping) will not be passed through the masquerading host. Also, application protocols that pass their address to the receiving host will not work. The talk program is an example of this.

A major exception to the applications that don't work is ftp. The IP masquerading software has been written to handle file transfers as of kernel version 1.3.37. FTP clients, under normal operation, will send the server the address and port number to which the server should connect for a transfer. This shouldn't work with masquerading for the same reasons that talk fails. However, the IP masquerading software will intercept the FTP PORT command and masquerade as the client host waiting for the server to connect to it.

The biggest problem is the most subtle one: IP fragmentation. Fragmentation occurs automatically within the Internet Protocol. IP always wants to fit a datagram in the frame size of the network link it is transmitting over. Most data links define a Maximum Transmission Unit (MTU) of information that will fit within one frame. If the IP datagram to be sent out can't all fit into the MTU size of the frame, it will be fragmented.

An IP datagram carrying a TCP segment is structured like the "Original Datagram" illustration in Figure 3. After fragmentation, the new datagrams appear (also shown in Figure 3). The most important aspect to notice is the placement of the TCP header. With fragmentation, it only appears in the first fragment and not in succeeding ones. Without the header, the host doing the masquerading has no way of determining whether the fragment should be forwarded. The same applies for fragmented UDP packets.
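A rough sketch of the splitting may help here. It assumes a 20-byte IP header without options and uses the standard rule that every fragment except the last carries a multiple of 8 data bytes. The function name and the example sizes (a 556-byte IP payload over a link with an MTU of 212) are chosen for illustration.

```python
IP_HEADER = 20  # bytes, assuming no IP options


def fragment(payload_len: int, mtu: int):
    """Split an IP payload (transport header + data) into fragments for one link.

    Returns a list of dicts with the byte offset, the data length, the
    more-fragments flag and whether the fragment carries the TCP/UDP header.
    """
    if mtu < IP_HEADER + 8:
        raise ValueError("MTU too small to carry any fragment data")
    # All fragments except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        fragments.append({
            "offset": offset,
            "data_len": size,
            "more_fragments": offset + size < payload_len,
            # Only the fragment at offset 0 contains the transport header.
            "has_transport_header": offset == 0,
        })
        offset += size
    return fragments


if __name__ == "__main__":
    # 20-byte TCP header plus 536 bytes of data, sent over an MTU of 212.
    for frag in fragment(556, 212):
        print(frag)
```

In this example only the first of the three fragments has port numbers in it, which is exactly the information a masquerading host needs to match a packet to a table entry.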

With TCP, this problem is mostly avoided because of TCP's MSS (Maximum Segment Size) negotiation. That's not to say it won't happen, but it doesn't occur most of the time. UDP, however, is much more susceptible to this type of behavior. Your only solution as an administrator is to be careful about controlling MTU sizes on SLIP or PPP networks.

Other problems also exist for X applications (connections back to the X server); RealAudio (patches available, however); and rlogin (rlogind requires a privileged port).

Real World Problem
Actual troubleshooting of masquerading problems is not always as easy as getting the rules straight. One subscriber to the IP masquerading mailing list (see Sidebar) presented an interesting problem. It was solved with simple analysis, code knowledge, and a good hex editor.

The Problem
Greg Priem sent a message to the IP masquerading mailing list describing a problem in which his telnet sessions would freeze. He isolated a sequence of events that reproduced the problem--he would log into his service provider's main host from a machine behind his Linux box and type in ls -l.

Analysis
Greg did some initial analysis and posted what he found. The network he was using is illustrated in Figure 4. The telnets were from the Mac to the ISP and other hosts on the Internet. He noticed telnets from the Mac to the Linux Box worked fine, as well as telnets from the Linux Box to the ISP.

Output from tcpdump revealed fragmentation was taking place. I followed up with a message indicating a possible problem and asked Greg to check the MTU sizes on each interface of the Linux Box.

I thought it strange that fragmentation was occurring on a telnet session since telnet uses TCP. As mentioned before, when TCP opens a connection, the MSS negotiation is supposed to eliminate fragmentation.

Further debugging with tcpdump (a handy program) showed the MTU assigned by the ISP was 212. To try to eliminate fragmentation, the SLIP link was also assigned an MTU of 212 by Greg. When looking at the MSS negotiation of the connections, Greg found that from the Linux box to the ISP, the MSS was set to 172, and from the Mac to the Linux box it was the same. However, a connection from the Mac to the ISP showed an MSS of 536.

The Solution
Given that information, I was able to deduce the problem and respond with an appropriate solution.

The connection scenarios are given in Figure 5.

One thing to note was the MSS advertisement of 536 from the Mac when it had an immediate link with an MTU smaller than that. BSD-experienced people will remember this number from the networking code that chose an MSS value for TCP's negotiation by seeing if the destination was on the local LAN or a remote LAN. The code roughly looked like:

if dest_net == local_net
then
    mss = (link MTU) - 40
else
    mss = 536    /* determined by 576 - 40 */
fi

If the destination was on a remote network, it would set the MSS automatically to 536. This was a good number because the RFC for IP stated that the default datagram size for internetworking is 576, meaning every device should be able to handle it without further fragmentation. Forty is subtracted to allow for IP and TCP headers.
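The same decision can be written as a short runnable sketch, together with a check of whether a given MSS leads to fragmentation on a link. The function names are invented for this example; the numbers 40, 576 and 212 are the ones quoted in the text.

```python
TCPIP_HEADERS = 40      # bytes allowed for the IP and TCP headers
DEFAULT_DATAGRAM = 576  # default datagram size for internetworking


def choose_mss(dest_is_local: bool, link_mtu: int) -> int:
    """BSD-style MSS choice: use the link MTU only for local destinations."""
    if dest_is_local:
        return link_mtu - TCPIP_HEADERS
    return DEFAULT_DATAGRAM - TCPIP_HEADERS


def will_fragment(mss: int, link_mtu: int) -> bool:
    """True if a full-sized segment plus headers does not fit in one frame."""
    return mss + TCPIP_HEADERS > link_mtu


if __name__ == "__main__":
    local_mss = choose_mss(True, 212)    # destination on the local network
    remote_mss = choose_mss(False, 212)  # destination on a remote network
    print(local_mss, will_fragment(local_mss, 212))
    print(remote_mss, will_fragment(remote_mss, 212))
    print(will_fragment(remote_mss, 576))
```

With an MTU of 212 the local case yields 172 and fits, while the remote case yields 536 and has to be fragmented; on a link with an MTU of 576 the same 536-byte MSS fits without fragmentation.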

A second thing to notice was the Linux box forwarding the MSS advertisement. One might think that since a connection is being made from the Linux box as a consequence of masquerading, the MSS value would be based on the network link from the Linux box and not the original value from the sending host.

As an aside, there was the one unexplainable instance of connections made to the ISP host and the ISP sending back an MSS of 1460, as shown at the bottom of Figure 5. It's strange because it was also connected to the PPP link with an MTU of 212. This may be attributed to a lack of knowledge on the ISP's side of the network.

Since both sides were using an MSS value greater than the MTU of either link, there was bound to be fragmentation, even for a TCP connection. Under normal circumstances, this wouldn't matter, but it does confuse masquerading.

The simple solution was to have the ISP support an MTU of at least 576 and for Greg to set the SLIP link with an MTU of 576 or greater. Therefore, no fragmentation would occur.

Greg e-mailed his ISP and waited for an answer. When none arrived he became impatient. Since he didn't have the source to the TCP code on the Mac, the only way to look at it was with a hex editor. He started poking around to see if he could find the BSD-like code where it made the decision for the MSS, and sure enough, he found it. He changed the hard coded values of 536 to 172 (i.e. 212-40), restarted his Mac, and lo and behold, it worked--no more fragmentation! (By the way, the ISP did change the MTU size later.) His approach was a little more daring than what I would have done, but it seems to be the nature of Linux users to patch an existing binary if they can't recompile something.

Conclusions
IP masquerading is an interesting technology, but more importantly, it serves a very useful function for many Internet environments. It works well for common services such as telnet, http, and ftp, but it does not support everything. ICMP messages, talk, remote X applications, and rlogin do not work with masquerading. Fortunately, the software is still in its Alpha versions, and more development is being pursued.

Developing P2P Protocols across NAT

Hole punching is a possible solution to the NAT problem for P2P protocols.
Network address translators (NATs) are something every software engineer has heard of, not to mention networking professionals. NAT has become as ubiquitous as the Cisco router in networking terms.

Fundamentally, a NAT device allows multiple machines to communicate with the Internet using a single globally unique IP address, effectively solving the scarce IPv4 address space problem. Though not a long-term solution, as originally envisaged in 1994, for better or worse, NAT technology is here to stay, even when IPv6 addresses become common. This is partly because IPv6 has to coexist with IPv4, and one of the ways to achieve that is by using NAT technology.

This article is not so much a description of how a NAT works. There already is an excellent article on this subject by Geoff Huston (see the on-line Resources). It is quite comprehensive, though plenty of other resources are available on the Internet as well.

This article discusses a possible solution to the NAT problem for P2P protocols.

What Is Wrong with NAT?
NAT breaks the Internet more than it makes it. I may sound harsh here, but ask any peer-to-peer application developer, especially the VoIP folks, and they will tell you why.

For instance, you never can do Web hosting behind a NAT device. At least, not without sufficient tweaking. Not only that, you cannot run any service such as FTP or rsync or any public service through a NAT device. This can be solved by obtaining a globally unique IP address and configuring the NAT device to bypass traffic originating from that particular IP.

But, the particularly hairy issue with NATed IP addresses is that you can't access machines behind a NAT, simply because you won't even know that a NAT exists in between. By and large, NAT is designed to be transparent, and it remains so. Even if you know there is a NAT device, NAT will let traffic reach the appropriate private IP only if there is mapping between the private IP/TCP or UDP port number with the NAT's public IP/TCP or UDP port number. And, this mapping is created only when traffic originates from the private IP to the Internet, not vice versa.

To make things more complicated, NAT simply drops all unsolicited traffic coming from the Internet to the private hosts. Though this feature arguably adds a certain degree of security through obscurity, it creates more problems than it solves, at least from the perspective of the future of the Internet.

At least 50% of the most commonly used networking applications use peer-to-peer technology. Common examples include instant messaging protocols, VoIP applications, such as Skype, and the BitTorrent download accelerator. In fact, peer-to-peer traffic is only going to increase as time progresses, because the Internet has a lot more to offer beyond the traditional client/server paradigm.

Peer-to-peer technology, by definition, is a mesh network as opposed to a star network in a client/server model. In a peer-to-peer network, all nodes act simultaneously as client and server. This already leads to programming complexity, and peer-to-peer nodes also have to deal somehow with the problematic NAT devices in between.

To make things even more difficult for P2P application developers, there is no standardized NAT behavior. Different NAT devices behave differently. But, the silver lining is that a large portion of the NAT devices in existence today still behave sensibly enough at least to let peer-to-peer UDP traffic pass through.

Sending TCP traffic across a NAT device also has met with success, though you may not be as lucky as with UDP. In this article, we focus purely on UDP, because TCP NAT traversal still remains rather tricky. UDP NAT traversal also is not completely reliable across all NAT devices, but things are very encouraging now and will continue to get better as NAT vendors wake up to the need for supporting P2P protocols.

Incidentally, voice traffic is better handled by UDP, so that suits us fine. Now that we have a fairly good idea of the problem we are trying to solve, let's get down to the solution.

Anatomy of the Solution
The key to the NAT puzzle lies in the fact that in order for machines behind a NAT gateway to interact with the public Internet, NAT devices necessarily have to allow inbound traffic, that is, replies to requests originating from behind the NAT device. In other words, NAT devices let traffic through to a particular host behind a NAT device, provided the traffic is indeed a reply to a request that the host sent out through the NAT device. Now, as mentioned above, NAT devices vary widely in operation, and they let through replies coming from other hosts and port numbers, depending on their own notion of what a reply means.

Our job is simple if we understand this much: instead of connecting directly to the host behind NAT, we somehow need to mimic a scenario in which the target host originates a connection to us and then we connect to it as though we are responding to the request. In other words, our connection request to the target host should seem like a reply to the NAT device.

It turns out that this technique is easy to achieve using a method now widely known as UDP hole punching. Contrary to what the name suggests, this does not leave a gaping security hole or anything of the sort; it is simply a perfectly sensible and effective way to solve the NAT problem for peer-to-peer protocols.

In a nutshell, what UDP hole punching does already has been explained. Now if it were only that, life would be too simple, and you would not be reading this article. As it turns out, there are plenty of obstacles on the way, but none of them are too complicated.

First is the issue of how to get the private host to originate traffic so we can send our connection request to it masquerading as a reply. To make things worse, NAT devices also have an idle timer, typically of around 60 seconds, such that they stop waiting for replies once a request originates and no reply comes within 60 seconds. So, it is not enough that the private host originate traffic, but also we have to act fast: we have to send the "reply" before the NAT device removes the "association" with the private host, which will frustrate our connection attempt.
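The idle timer can be pictured with a small simulation. This is a sketch under stated assumptions: the class SimpleNat is invented, it accepts replies only from the exact address the private host contacted, it uses the 60-second figure from the text, and all addresses and ports are made up. Time is passed in explicitly so the example runs instantly.

```python
IDLE_TIMEOUT = 60.0  # seconds, the typical figure quoted in the text


class SimpleNat:
    """Toy NAT: remembers outbound requests and forgets them when idle."""

    def __init__(self, timeout=IDLE_TIMEOUT):
        self.timeout = timeout
        self.next_port = 40000  # hypothetical first public port
        self.bindings = {}      # public port -> (private addr, remote addr, last seen)

    def outbound(self, private_addr, remote_addr, now):
        """Record a request from a private host and return the public port used."""
        for port, (priv, rem, _) in self.bindings.items():
            if priv == private_addr and rem == remote_addr:
                self.bindings[port] = (priv, rem, now)  # refresh existing binding
                return port
        port = self.next_port
        self.next_port += 1
        self.bindings[port] = (private_addr, remote_addr, now)
        return port

    def inbound(self, public_port, remote_addr, now):
        """Return the private address a packet is forwarded to, or None if dropped."""
        entry = self.bindings.get(public_port)
        if entry is None:
            return None
        priv, rem, last_seen = entry
        if now - last_seen > self.timeout:
            del self.bindings[public_port]  # association has expired
            return None
        if rem != remote_addr:
            return None                     # not a reply to the recorded request
        self.bindings[public_port] = (priv, rem, now)
        return priv


if __name__ == "__main__":
    nat = SimpleNat()
    host, peer = ("192.168.1.2", 5000), ("198.51.100.7", 9000)
    port = nat.outbound(host, peer, now=0.0)
    print(nat.inbound(port, peer, now=30.0))   # reply in time: forwarded
    print(nat.inbound(port, peer, now=100.0))  # 70 s of silence: dropped
```

The second call fails because more than 60 seconds passed since the binding was last used, which is why the "reply" has to arrive quickly after the private host sends its packet.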

Now, a reply obviously has to come from the original machine to which the request was sent. This suits us fine if we are not behind another NAT device. So, if we want to talk to a private IP, we make the private IP send a packet to us, and we send our connection request as a reply to it. But, how do we inform the private IP to send a packet to us when we want to talk to it?

If both the peer-to-peer hosts are behind different NAT devices, is it possible at all to communicate with each other? Fortunately, it is possible.

It turns out that NAT devices are somewhat forgiving, and they differ in their levels of leniency when it comes to interpreting what they consider as reply to a request. There are different varieties of NAT behavior:

Full cone NAT

Restricted cone NAT

Restricted port NAT

Symmetric NAT

I won't go into the details and definitions of these here, as there are numerous resources explaining them elsewhere. Symmetric NATs are the most formidable enemy for P2P applications. However, with a degree of cleverness, we can reasonably "guess" the symmetric NAT behavior and deal with it. Well, not all symmetric NATs, but many of them can be tamed to allow P2P protocols.
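For readers who want a quick feel for how the four varieties differ, here is a compact sketch based on the commonly used definitions of these terms. The function names and addresses are invented for the example, and real devices often deviate from these textbook categories.

```python
CONE_TYPES = ("full_cone", "restricted_cone", "port_restricted")


def inbound_allowed(nat_type, contacted, sender):
    """Decide whether a packet from sender is let through an existing mapping.

    contacted is the set of (ip, port) pairs the private host has already
    sent to through this mapping; sender is the (ip, port) of the packet.
    """
    if not contacted:
        return False  # no outbound traffic yet, so no mapping exists
    sender_ip, _ = sender
    if nat_type == "full_cone":
        return True   # anyone may use the mapping once it exists
    if nat_type == "restricted_cone":
        return sender_ip in {ip for ip, _ in contacted}  # IP must match
    if nat_type in ("port_restricted", "symmetric"):
        return sender in contacted                       # IP and port must match
    raise ValueError("unknown NAT type: " + nat_type)


def reuses_public_port(nat_type):
    """Cone NATs keep one public port per private socket for all destinations;
    a symmetric NAT allocates a different one for each destination."""
    return nat_type in CONE_TYPES


if __name__ == "__main__":
    contacted = {("198.51.100.7", 9000)}
    same_host_other_port = ("198.51.100.7", 1111)
    stranger = ("203.0.113.9", 1234)
    for nat_type in CONE_TYPES + ("symmetric",):
        print(nat_type,
              inbound_allowed(nat_type, contacted, same_host_other_port),
              inbound_allowed(nat_type, contacted, stranger),
              reuses_public_port(nat_type))
```

The last column shows why symmetric NATs are the hard case: the public port a rendezvous server observes is not the one the NAT will use toward a different peer, so it has to be guessed.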

First, how do we tell the private IP that we are interested in connecting to it at a particular instance?

Implementation Details of the UDP Hole Punching Technique
This problem can be solved by joining the problem, rather than fighting it head on. In order to achieve peer-to-peer traffic across NATs, we have to modify our P2P mesh model slightly to make it a hybrid of a traditional star model and modern mesh model.

So, we introduce the concept of a rendezvous server, or mediator server, which listens on a globally routable IP address. Almost all peer-to-peer protocols have traditionally relied on certain supernodes, or in other words, in P2P, all nodes are equal but some are more equal. Some nodes always have acted as key players in any P2P protocol. If you have heard of a BitTorrent tracker, you know what I mean.

A rendezvous concept is nothing new in the P2P world, nor is the star model totally done away with in P2P.

Coming back to our original NAT problem, private IPs obviously can browse the Internet through NAT devices, and thus they can talk HTTP through port 80 or through a proxy HTTP port over TCP. So private IPs can almost always open TCP connections to global IP addresses. We use this fact to make the private IP connect to a mediator or rendezvous server through TCP.

Our solution relies on the fact that all the P2P nodes are constantly in touch with a rendezvous server, listening on a global IP address through a persistent TCP connection. Remember that P2P nodes are both client and server at the same time, so they can originate connections as well as serve connection requests simultaneously.

It is through this TCP connection that we inform a particular P2P node that another node wants to talk to it. The target node then sends a request to that peer, and the peer sends its connection request as a response to that request.

Because the private machines behind a NAT device do not have a routable IP address, the only way for us to access them from outside the NAT device is through the mapping that the NAT device maintains for the machine to talk to the external world. For each connection originated from the private IP, a unique port is assigned at the NAT device. For us to talk to the private IP, we have to send our packets to that particular port assigned for the private IP's connection to the external world. Now, we know that there is no notion of connection in the UDP world, so the NAT relies on a timeout instead: if a reply doesn't come for a UDP request in about 60 seconds, the connection is deemed non-existent and the mapping is closed.

So now we have another problem: that of determining the port assigned at the NAT's public interface for the private IP connection. This can be inferred by inspecting the source address of the UDP datagram that reaches any global IP.

So far so good. If we are not behind NAT, we can use the previously mentioned technique to initiate communication with a private IP using the rendezvous server.

However, reality tells us that P2P peers are more likely to be behind a NAT than otherwise. So, this solution is not enough. We want to initiate a P2P connection from behind a NAT device ourselves. So, now we have two NAT devices in the picture, one behind each P2P node.

Now the real fun begins. First, let's redefine our goal in the light of this new twist to the problem and attack it step by step. What we want to do now is use the rendezvous server and inform the target P2P node to send us a request, but we are behind a NAT.

So, for any external party to talk to us, we should have a global IP/port combo that exists at the NAT public interface. First we have to create one for ourselves. Only then can we receive communication requests coming from outside the NAT network.

We can create a mapping for us by sending a packet to a global IP. The global IP can then figure out our mapping by inspecting the from address. But how do we inform our P2P node of this address? For that we can use the TCP connection with the rendezvous machine. But, only the global IP to which we send the packet knows our association, so how do we figure that out? It's simple. The global IP can send that information to us as a reply in the packet payload to us.
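The reflection step described above can be sketched in C. This is only an illustration under stated assumptions, not the code from the article's listings: the helper names format_endpoint() and reflect_once(), and the choice of sending the address back as plain "ip:port" text, are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Format an IPv4 endpoint as "a.b.c.d:port".
 * Returns 0 on success, -1 if it does not fit in `out`. */
int format_endpoint(const struct sockaddr_in *src, char *out, size_t outlen)
{
    char ip[INET_ADDRSTRLEN];
    int n;

    assert(src != NULL && out != NULL);
    if (inet_ntop(AF_INET, &src->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;
    n = snprintf(out, outlen, "%s:%u", ip, (unsigned)ntohs(src->sin_port));
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}

/* Hypothetical helper for the machine with the global IP: receive one
 * probe on a bound UDP socket and send the observed source address
 * (the sender's NAT mapping) back to the sender as the payload. */
int reflect_once(int sock)
{
    struct sockaddr_in src;
    socklen_t srclen = sizeof(src);
    char probe[64], reply[64];

    if (recvfrom(sock, probe, sizeof(probe), 0,
                 (struct sockaddr *)&src, &srclen) < 0)
        return -1;
    if (format_endpoint(&src, reply, sizeof(reply)) < 0)
        return -1;
    if (sendto(sock, reply, strlen(reply), 0,
               (struct sockaddr *)&src, srclen) < 0)
        return -1;
    return 0;
}
```

The node behind the NAT would send any small datagram to the global IP, read the reply, and then pass the reported IP/port pair to its peer over the TCP connection with the rendezvous machine.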

Assuming that we somehow obtain a public IP, port pair and figure that out, we tell the mediator that we are listening at that public IP/port pair and request the P2P target node to initiate a request to us. Subsequently, we can connect to it as a reply to that message.

But, then we cannot receive packets from the P2P target node, because NAT is not expecting a reply from that global IP. In fact, some NATs that show full cone behavior allow packets to come from any IP, but most NATs do not. Back to square one.

Consider this: if both P2P nodes behind the NAT send packets to each other's public IP/port, the first packet from each party is discarded because it was unsolicited. But subsequent packets are let through because NAT thinks the packets are replies to our original request. And voilà the hole is punched, and UDP traffic can pass through directly between the P2P nodes.

Unfortunately, NATs also differ in how they assign public ports for different destination IPs. Fortunately, most NAT devices do not change the public port between requests to different destination IPs, so we can usually rely on that.

So first we send certain probe or discovery packets to two different IPs and figure out the behavior of the NAT. If it is found to be consistent, our approach will work. In the unlikely case that we bump into symmetric NAT behavior that varies the port between requests, we can figure out the delta by which the port number varies. And, using this we can guess the port assigned for a particular request.
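The delta-based guess can be written down in a few lines of C. This is a sketch under the assumption that the NAT changes the port by a fixed step per new destination; the function name predict_next_port() is hypothetical, and a real NAT may not behave this regularly.

```c
#include <assert.h>
#include <stdint.h>

/* Given the public ports observed for two consecutive probes (sent to
 * two different global IPs), guess the port the NAT will assign to the
 * next request. Returns 0 when the guess would fall outside the valid
 * port range 1..65535. */
uint16_t predict_next_port(uint16_t first, uint16_t second)
{
    int delta, next;

    assert(first != 0 && second != 0);
    delta = (int)second - (int)first;   /* 0 for a consistent NAT */
    next = (int)second + delta;

    if (next < 1 || next > 65535)
        return 0;
    return (uint16_t)next;
}
```

If both probes report the same port, the delta is zero and the prediction is simply that port, which is the consistent case where the basic approach works unchanged.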

The reason we are so particular about this is that the first packet to our P2P destination behind NAT is dropped by NAT. So, all we can do is guess. In practice, however, it works fairly well. This is why it is important that the P2P nodes keep the source and the destination ports the same for communication.

Once this hole punching procedure is performed, the two P2P nodes can communicate with each other without the help of the rendezvous machine. So the rendezvous machine is useful only for informing a P2P node about an incoming connection and informing each of the communicating peers about each other's public addresses. Subsequently, the communication happens directly without the intervention of the rendezvous server.

Now we have to apply some ingenuity and introduce appropriate headers in the packets to inform the peer whether it is sending a reply meant for the P2P client or whether it is sending a request meant for the P2P server. Once we are able to differentiate between the two, we are set. We also need to differentiate between hole punching traffic and regular traffic, because hole punching traffic needs to be bounced, and regular traffic needs to be processed.
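One possible shape for such a header is sketched below in C. The layout (a marker byte, a kind byte and a 2-byte big-endian payload length), the enum names and the marker value are all assumptions made for illustration; the article's listings may use a different format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds; names and values are illustrative. */
enum msg_kind {
    MSG_PUNCH   = 1,  /* hole punching probe: bounce it back */
    MSG_REQUEST = 2,  /* regular traffic for the peer's P2P server side */
    MSG_REPLY   = 3   /* regular traffic for the peer's P2P client side */
};

#define HDR_MAGIC 0xB7u   /* arbitrary marker byte */
#define HDR_LEN   4u

/* Write the header into buf. Returns HDR_LEN, or -1 if buf is too small. */
int hdr_encode(unsigned char *buf, size_t buflen,
               enum msg_kind kind, uint16_t payload_len)
{
    assert(buf != NULL);
    if (buflen < HDR_LEN)
        return -1;
    buf[0] = HDR_MAGIC;
    buf[1] = (unsigned char)kind;
    buf[2] = (unsigned char)(payload_len >> 8);
    buf[3] = (unsigned char)(payload_len & 0xFF);
    return (int)HDR_LEN;
}

/* Parse a header. Returns 0 on success, -1 if the datagram is too
 * short, carries the wrong marker, or has an unknown kind. */
int hdr_decode(const unsigned char *buf, size_t buflen,
               enum msg_kind *kind, uint16_t *payload_len)
{
    assert(buf != NULL && kind != NULL && payload_len != NULL);
    if (buflen < HDR_LEN || buf[0] != HDR_MAGIC)
        return -1;
    if (buf[1] < MSG_PUNCH || buf[1] > MSG_REPLY)
        return -1;
    *kind = (enum msg_kind)buf[1];
    *payload_len = (uint16_t)((buf[2] << 8) | buf[3]);
    return 0;
}
```

A receiver would decode the header first, bounce MSG_PUNCH datagrams straight back, and hand the other two kinds to its server or client half.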

Of course, if we stop sending and receiving, the association at the NAT device at both ends will expire. So we either can send keepalive traffic or rerun the hole punching technique. You can choose whichever technique is suitable depending upon your needs.
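For the keepalive option, the timing decision can be sketched as below. It assumes the roughly 60 second idle timeout mentioned earlier and refreshes the mapping at half that interval to leave some margin; the function name keepalive_due() and the half-interval rule are illustrative choices only.

```c
#include <assert.h>
#include <time.h>

/* Returns 1 when a keepalive datagram should be sent now, 0 otherwise.
 * `last_traffic` is the time anything was last sent to the peer and
 * `nat_timeout_s` is the assumed NAT idle timeout in seconds. */
int keepalive_due(time_t last_traffic, time_t now, int nat_timeout_s)
{
    assert(nat_timeout_s > 0);
    return difftime(now, last_traffic) >= nat_timeout_s / 2.0;
}
```

A peer would call this from its main loop with time(NULL) as `now` and send a small hole punching datagram whenever it returns 1.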

This technique will not work if both the P2P nodes are behind the same NAT device. So, we also have to figure out whether we can communicate directly using the private IP address itself. Thus, our hole punching has to try the private interface along with the peer's public interface. And, it can happen that our private network has the same private IP as the peer's private IP. So we have to guard against getting spurious responses.

It also can happen that another P2P node in the same private network as ours has the same private IP as the P2P node we want to talk to in another private network. Then we have to do additional validation against the peer's identity to make sure we really are talking to the intended node.

In the unlikely case that you run into brain-damaged NAT devices at both ends, this technique obviously will fail, because we should be able to predict the public address assigned to us. In that situation, the only way is to make the rendezvous server act as a relay for the traffic. So peer-to-peer traffic goes through, but it is no longer peer to peer with the rendezvous machine acting as server. If you run into such situations, you need to think of implementing that as well.

Now, for the Real Dope, the C Code for Achieving the above
Due to their long length, the listings for this article are located on the Linux Journal FTP site at ftp.ssc.com/pub/lj/listings/issue148/9004.tgz. If you need more information on implementing your own hole punching library, you always can refer to the above design constraints and design a solution appropriately.

Please note that I have consciously left out the RFCs and NAT discovery techniques, such as STUN and frameworks like ICE. UDP hole punching is already complicated, and we don't gain anything by making it even more bloated without adding any real value. So, the technique as it stands works as well as or even better than other NAT traversal mechanisms.

First, take a look at the rendezvous code (Listing 1). Note that we use select() to serve multiple sockets. We could as well use kqueue() on *BSD, or better, use the libevent abstraction (see Resources). But, I stuck to select() because performance doesn't matter so much to us. We talk to the mediator server only for establishing peer-to-peer connections, not otherwise.

The hole punching implementation is given in Listing 2 and the P2P client in Listing 3.

Using this method, you should be able to develop your own peer-to-peer protocol. You easily can develop your own instant messaging protocol along with some GUI code. You can transfer files either using nc or using code for that directly. You can develop certain applications, such as transferring voice via a microphone and speaker. In other words, you can develop a hobby VoIP application with this.

Several possibilities exist. You can add some reliability on top of UDP in case you are paranoid about your data reaching you safely.

One very useful tool that helped me immensely in this endeavor is the Network Swiss-Army knife, netcat.

You can see hole punching in action by using this simple command. At each end, type:

$ nc -u -p 17000 <peer-public-IP> 17000

Here <peer-public-IP> stands for the public IP address of the other end. With only the peer public IP different, you can start communicating if you are lucky, because most NAT devices try to assign a public port equal to the private port.

If you want to test TCP hole punching, try this:

$ nc -l -p 17000

at one end and this:

$ nc -p 17000 <peer-public-IP> 17000

at the other end.

Future Work
Rather than having one rendezvous server, you can have a few of them for failover and geographical distribution. However, if you are behind two levels of NAT, sometimes this may not work. You also could listen on multiple virtual and real interfaces and attempt hole punching on all of them. You can add TCP hole punching on similar lines and try that first, and then attempt UDP hole punching.

I am only collecting the good work of people in a single place ... I have made no contribution towards the development of this article.

A Stream Socket API for C++

Using the standard socket libraries can be a little intimidating: you need to know what to call, when to call it, how to set the right parameters and which ones to leave alone. This interface for C++ is simpler to use. If you think about it, even reading from the stream works like standard character streams and will block when data is not available.
This library and example are provided by Rob Tougher in an article from the Linux Gazette. His permission has been granted to use this library for class purposes. If you use it otherwise, it is up to you to gain permission for use in your application.

--------------------------------------------------------------------------------

Server
Socket.cpp provides the wrapper API for the basic calls used in standard C examples.
The public procedures available are:
--------------------------------------------------------------------------------

public:
Socket();
virtual ~Socket();

// Server initialization
bool create();
bool bind ( const int port );
bool listen() const;
bool accept ( Socket& ) const;

// Client initialization
bool connect ( const std::string host, const int port );

// Data Transmission
bool send ( const std::string ) const;
int recv ( std::string& ) const;

void set_non_blocking ( const bool );

bool is_valid() const { return m_sock != -1; }

--------------------------------------------------------------------------------
The Socket class is defined to be used to create the ServerSocket class. The ServerSocket class basically adds exception handling and provides a stream interface.
An example of the use of the ServerSocket is shown below. The example simply echoes whatever is sent to it.

--------------------------------------------------------------------------------

int main ( int argc, char* argv[] )
{
  std::cout << "running....\n";
  try
    {
      // Create the listening socket
      ServerSocket server ( 30000 );

      while ( true )
        {
          // Create the conversational socket
          ServerSocket new_sock;
          // Wait for a client connection
          server.accept ( new_sock );

          try
            {
              while ( true )
                {
                  // Read the string and write it back
                  std::string data;
                  new_sock >> data;
                  new_sock << data;
                }
            }
          // A socket error on this connection ends the inner loop;
          // go back and wait for the next client
          catch ( SocketException& ) {}
        }
    }
  catch ( SocketException& e )
    {
      std::cout << "Exception was caught:" << e.description() << "\nExiting.\n";
    }

  return 0;
}

Sockets API

Sockets Tutorial

This is a simple tutorial on using sockets for interprocess
communication.

The client server model

Most interprocess communication uses the client server
model. These terms refer to the two processes which
will be communicating with each other. One of the
two processes, the client, connects to the other process, the
server,
typically to make a request for information. A good analogy is
a person who makes a phone call to another person.

Notice that the client needs to know of the existence of and
the address of the server, but the server does not need to
know the address of (or even the existence of) the client prior
to the connection being established. Notice also that once
a connection is established, both sides can send and receive
information.

The system calls for establishing a connection are somewhat
different for the client and the server, but both involve
the basic construct of a socket. A socket is one end of
an interprocess communication channel. The two processes
each establish their own socket.

The steps involved in establishing a socket on the client
side are as follows:

Create a socket with the socket() system call
Connect the socket to the address of the server using the
connect() system call
Send and receive data. There are a number of ways to do this,
but the simplest is to use the read() and write() system calls.

The steps involved in establishing a socket on the
server side are as follows:

Create a socket with the socket() system call
Bind the socket to an address using the bind() system call.
For a server socket on the Internet, an address consists of a
port number on the host machine.
Listen for connections with the listen() system call
Accept a connection with the accept() system call.
This call typically blocks until a client connects with the server.
Send and receive data

Socket Types

When a socket is created, the program has to specify the
address domain and the socket type. Two processes
can communicate with each other only if their sockets are of the same
type and in the same domain.

There are two widely used address domains, the Unix domain, in which
two processes which share a common file system communicate, and the
Internet domain, in which two processes running on any two hosts on
the Internet communicate. Each of these has its own address format.

The address of a socket in
the Unix domain is a character string which is basically an entry in
the file system.
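As a small illustration of a Unix domain address, the sketch below fills in
a struct sockaddr_un from a file system path. The helper name
make_unix_addr() and the path used in the usage note are made up for this
example; the rest of this tutorial uses the Internet domain instead.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill `out` with a Unix domain address naming `path`.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int make_unix_addr(const char *path, struct sockaddr_un *out)
{
    size_t len;

    assert(path != NULL && out != NULL);
    len = strlen(path);
    if (len >= sizeof(out->sun_path))
        return -1;
    memset(out, 0, sizeof(*out));
    out->sun_family = AF_UNIX;
    memcpy(out->sun_path, path, len + 1);   /* include the final '\0' */
    return 0;
}
```

A call such as make_unix_addr("/tmp/demo.sock", &addr) prepares an address
that could then be passed to bind() or connect() on a socket created with
the AF_UNIX domain.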

The address of a socket in the Internet domain
consists of the Internet address of the host machine (under IPv4, every
host on the Internet has a 32 bit address, often referred to as its
IP address). In addition, each socket needs a port number on that
host. Port numbers are 16 bit unsigned integers. The lower numbers
(those below 1024) are reserved in Unix for standard services. For
example, the port number for the FTP server is 21. It is important that
standard services be at the same port on all computers so that clients
will know their addresses. However, port numbers above 2000 are
generally available.

There are two widely used socket types, stream sockets, and
datagram sockets. Stream sockets treat communications as a
continuous stream of characters, while datagram sockets have to read
entire messages at once. Each uses its own communications protocol.
Stream sockets use TCP (Transmission Control Protocol), which is a
reliable, stream oriented protocol, and datagram sockets use UDP (User
Datagram Protocol), which is unreliable and message oriented.
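The difference between the two socket types can be seen with a short
experiment. The sketch below is an illustration only: to stay runnable
without a network it uses local socket pairs in the Unix domain rather than
TCP and UDP, and the helper name first_recv_size() is made up for this
example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send "ab" and then "cd" through a connected local socket pair of the
 * given type, and return how many bytes a single recv() call hands
 * back on the other end. Returns -1 on error. */
int first_recv_size(int type)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    assert(type == SOCK_STREAM || type == SOCK_DGRAM);
    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;
    if (send(sv[0], "ab", 2, 0) != 2 || send(sv[0], "cd", 2, 0) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    n = recv(sv[1], buf, sizeof(buf), 0);
    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

With SOCK_DGRAM the single recv() returns one whole message (2 bytes). With
SOCK_STREAM the message boundaries are gone and, on a local pair like this
one, the same call typically returns all 4 bytes that are waiting.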

The examples in this tutorial will use sockets in the Internet domain
using the TCP protocol.

Sample code

C code for a very simple client and server are provided for you.
These communicate using stream sockets in the Internet domain. The
code is described in detail below. However, before you read the
descriptions and look at the code, you should compile and run the two
programs to see what they do.

The two programs are the server program and the client program.
Save them into files called server.c and client.c and compile them
separately into two executables called server and client.
They probably won't require any special compiling flags, but on
some Solaris systems you may need to link to the socket library
by appending -lsocket to your compile command.

Ideally, you should run the client and the server on separate hosts on
the Internet. Start the server first. Suppose the server is running
on a machine called cheerios. When you run the server,
you need to pass the port number in as an argument. You can choose
any number between 2000 and 65535. If this port is already in use on
that machine, the server will tell you this and exit. If this
happens, just choose another port and try again. If the port is
available, the server will block until it receives a connection
from the client. Don't be alarmed if the server doesn't do anything;
it's not supposed to do anything until a connection is made.
Here is a typical command line:


server 51717

To run the client you need to pass in two arguments, the name of the
host on which the server is running and the port number on which the
server is listening for connections.
Here is the command line to connect to the server described above:


client cheerios 51717

The client will prompt you
to enter a message. If everything works correctly, the server will
display your message on stdout, send an acknowledgement message to
the client and terminate. The client will print the acknowledgement
message from the server and then terminate.

You can simulate this on a single machine by running the server in one
window and the client in another. In this case, you can use the keyword
localhost as the first argument to the client.

Server code

The server code uses a number of ugly programming constructs, and so
we will go through it line by line.


#include <stdio.h>

This header file contains declarations used in most input and
output and is typically included in all C programs.


#include <sys/types.h>

This header file contains definitions of a number of data types
used in system calls. These types are used in the next two
include files.


#include <sys/socket.h>

The header file socket.h includes
a number of definitions of structures needed for sockets.


#include <netinet/in.h>

The header file netinet/in.h contains
constants and structures needed for internet domain addresses.


void error(char *msg)
{
    perror(msg);
    exit(1);
}

This function is called when a system call fails.
It displays a message about the error on stderr and
then aborts the program. The
perror man page gives more information.


int main(int argc, char *argv[])
{
     int sockfd, newsockfd, portno, n;
     socklen_t clilen;   /* accept() expects a socklen_t for the address size */

sockfd and newsockfd are file
descriptors, i.e. array subscripts into the
file descriptor table. These two variables
store the values returned by the socket system call and the accept
system call.

portno stores the port number on which the server
accepts connections.

clilen stores the size of the address of the client.
This is needed for the accept system call.

n is the return value for the read()
and write() calls; i.e. it contains the number of
characters read or written.


     char buffer[256];

The server reads characters from the socket connection into this buffer.


     struct sockaddr_in serv_addr, cli_addr;

A sockaddr_in is a structure containing an internet
address. This structure is defined in <netinet/in.h>.
Here is the definition:


struct sockaddr_in {
     short   sin_family; /* must be AF_INET */
     u_short sin_port;
     struct  in_addr sin_addr; 
     char    sin_zero[8]; /* Not used, must be zero */
};

An in_addr structure, defined in the same header file,
contains only one field, an unsigned long called s_addr
(on current systems this field is a 32 bit unsigned integer type).

The variable serv_addr will contain the address of
the server, and cli_addr will contain the address of
the client which connects to the server.


     if (argc < 2) {
         fprintf(stderr,"ERROR, no port provided\n");
         exit(1);
     }

The user needs to pass in the port number on which the server will
accept connections as an argument. This code displays an error
message if the user fails to do this.


     sockfd = socket(AF_INET, SOCK_STREAM, 0); 
     if (sockfd < 0) 
         error("ERROR opening socket");

The socket() system call creates a new socket. It takes
three arguments. The first is the address domain of the socket.
Recall that there are two possible address domains, the Unix domain
for two processes which share a common file system, and the Internet
domain for any two hosts on the Internet. The symbolic constant
AF_UNIX is used for the former, and AF_INET
for the latter (there are actually many other options which can be
used here for specialized purposes).

The second argument is the type of socket. Recall that there
are two choices here, a stream socket in which characters are read
in a continuous stream as if from a file or pipe, and a
datagram socket, in which messages are read in chunks.
The two symbolic constants are SOCK_STREAM and
SOCK_DGRAM.

The third argument is the protocol. If this argument is zero
(and it always should be except for unusual circumstances), the
operating system will choose the most appropriate protocol.
It will choose TCP for stream sockets and UDP for datagram sockets.

The socket system call returns an entry into the file descriptor
table (i.e. a small integer). This value is used for all subsequent
references to this socket. If the socket call fails, it returns -1.
In this case the program displays an error message and exits. However,
this system call is unlikely to fail.

This is a simplified description of the socket call; there are
numerous other choices for domains and types, but these are the most
common. The socket() man page
has more information.


     bzero((char *) &serv_addr, sizeof(serv_addr));

The function bzero() sets all values in a buffer to zero.
It takes two arguments, the first is a pointer to the buffer and the
second is the size of the buffer. Thus, this line initializes
serv_addr to zeros.


     portno = atoi(argv[1]);

The port number on which the server will listen for connections
is passed in as an argument, and this statement uses the atoi()
function to convert this from a string of digits
to an integer.
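Note that atoi() does no error checking, so an argument such as "abc"
silently becomes port 0. A stricter alternative, shown here only as a
sketch and not as part of the tutorial's server, is to parse with strtol()
and check the range. The helper name parse_port() is made up for this
example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal port number. Returns the port (1..65535), or -1 if
 * the string is empty, has trailing junk, or is out of range. */
int parse_port(const char *s)
{
    char *end;
    long v;

    assert(s != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return -1;
    if (v < 1 || v > 65535)
        return -1;
    return (int)v;
}
```

The server could then call parse_port(argv[1]) and print a usage message
when the result is -1.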


     serv_addr.sin_family = AF_INET;

The variable serv_addr is a structure of type
struct sockaddr_in. This structure has four fields.
The first field is short sin_family, which contains a
code for the address family. It should always be set to the
symbolic constant AF_INET.


     serv_addr.sin_port = htons(portno);

The second field of serv_addr is unsigned short sin_port,
which contains the port number. However, instead of simply
copying the port number to this field, it is necessary to convert
this to network byte order using the function
htons() which converts a port number in host byte order
to a port number in network byte order.
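Network byte order means the most significant byte comes first. The short
sketch below makes this visible by copying the result of htons() into two
bytes; the helper name port_to_wire() is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Store `port` in out[0..1] exactly as it appears in sin_port,
 * i.e. in network byte order (most significant byte first). */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);

    assert(out != NULL);
    memcpy(out, &net, sizeof(net));
}
```

For port 51717, which is 0xCA05 in hexadecimal, out[0] holds 0xCA and
out[1] holds 0x05 regardless of the byte order of the host.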


     serv_addr.sin_addr.s_addr = INADDR_ANY;

The third field of sockaddr_in is a structure of
type struct in_addr which contains only a single field
unsigned long s_addr. This field contains the IP address
of the host. For server code, this is normally set to the symbolic
constant INADDR_ANY, which tells the system to accept connections
arriving on any of the IP addresses of the machine on which the
server is running.


     if (bind(sockfd, (struct sockaddr *) &serv_addr,
              sizeof(serv_addr)) < 0)
                  error("ERROR on binding");

The bind() system call binds a socket to an
address, in this case the address of the current host and
port number on which the server will run. It takes three
arguments, the socket file descriptor, the address to which
it is bound, and the size of that address.
The second argument is a pointer to a structure of type
sockaddr, but what is passed in is a pointer to a structure
of type sockaddr_in, and so this must be cast to
the correct type. This can fail for a number of reasons, the
most obvious being that this port is already in use on this
machine. The bind() man page
has more information.


     listen(sockfd,5);

The listen system call allows the process to listen on
the socket for connections. The first argument is the socket
file descriptor, and the second is the size of the backlog queue,
i.e., the number of connections that can be waiting while the process
is handling a particular connection. The value 5 used here was
the maximum size permitted by many older systems; current systems
accept larger values, up to the constant SOMAXCONN. For a socket
that has just been bound successfully this call is unlikely to fail,
and so the code doesn't check for errors. The listen()
man page has more information.


     clilen = sizeof(cli_addr);
     newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);
     if (newsockfd < 0) 
          error("ERROR on accept");

The accept() system call causes the process to
block until a client connects to the server. Thus, it wakes up
the process when a connection from a client has been successfully
established. It returns a new file descriptor, and all communication
on this connection should be done using the new file descriptor.
The second argument is a pointer to a structure that receives the
address of the client on the other end of the connection, and the third
argument is a pointer to the size of this structure. The accept()
man page has more information.


     bzero(buffer,256);
     n = read(newsockfd,buffer,255);
     if (n < 0) error("ERROR reading from socket");
     printf("Here is the message: %s\n",buffer);

Note that we would only get to this point after a client has
successfully connected to our server. This code initializes the
buffer using the bzero() function, and then reads from
the socket. Note that the read call uses the new file descriptor, the
one returned by accept(), not the original file descriptor
returned by socket(). Note also that the read() call
will block until there is something for it to read in the
socket, i.e. after the client has executed a write().
It will read either the number of characters currently available in
the socket or 255, whichever is less, and return the number of
characters read. Because the buffer holds 256 bytes and was zeroed
first, the data is always followed by a terminating zero when it is
printed. The read() man page has
more information.
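Because a single read() on a stream socket may return fewer characters
than the other side wrote in total, programs that need a fixed amount of
data usually call read() in a loop. The following is a sketch of such a
loop, not part of the tutorial's server; the helper name read_full() is
made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Read until `count` bytes have arrived or the other end closes the
 * connection. Returns the number of bytes read, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n == 0)
            break;                  /* end of file: peer closed */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

The simple server above does not need this, since it only displays
whatever the first read() returns.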


     n = write(newsockfd,"I got your message",18);
     if (n < 0) error("ERROR writing to socket");

Once a connection has been established, both ends can both read and
write to the connection. Naturally, everything written by the client
will be read by the server, and everything written by the server
will be read by the client. This code simply writes a short
message to the client. The last argument of write is the size of the
message. The write() man page
has more information.


     return 0; 
}

This terminates main and thus the program. Since main was declared to
be of type int as specified by the ANSI standard, some compilers
complain if it does not return anything.

Client code

As before, we will go through the program client.c line
by line.


#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

The header files are the same as for the server with one addition.
The file netdb.h defines the structure hostent,
which will be used below.


void error(char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int sockfd, portno, n;
    struct sockaddr_in serv_addr;
    struct hostent *server;

The error() function is identical to that in the server,
as are the variables sockfd, portno, and n.
The variable serv_addr will contain the address of the
server to which we want to connect. It is of type
struct sockaddr_in.

The variable server is a pointer to a structure of
type hostent. This structure is defined in the header
file netdb.h as follows:


struct  hostent {
        char    *h_name;        /* official name of host */
        char    **h_aliases;    /* alias list */
        int     h_addrtype;     /* host address type */
        int     h_length;       /* length of address */
        char    **h_addr_list;  /* list of addresses from name server */
#define h_addr  h_addr_list[0]  /* address, for backward compatiblity */
};

It defines a host computer on the Internet.
The members of this structure are:


h_name       Official name of the host.

h_aliases    A zero  terminated  array  of  alternate
             names for the host.

h_addrtype   The  type  of  address  being  returned;
             currently always AF_INET.

h_length     The length, in bytes, of the address.

h_addr_list  A pointer to a list of network addresses
             for the named host.  Host addresses are
             returned in network byte order.

Note that h_addr is an alias for the first
address in the array of network addresses.


    char buffer[256];
    if (argc < 3) {
       fprintf(stderr,"usage %s hostname port\n", argv[0]);
       exit(0);
    }
    portno = atoi(argv[2]);
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) 
        error("ERROR opening socket");

All of this code is the same as that in the server.


    server = gethostbyname(argv[1]);
    if (server == NULL) {
        fprintf(stderr,"ERROR, no such host\n");
        exit(0);
    }

argv[1] contains the name of a host on the Internet,
e.g. cheerios.cs.rpi.edu. The function:


     struct hostent *gethostbyname(char *name)

takes such a name as an argument and returns a pointer to a
hostent containing information about that host.
The field char *h_addr contains the IP address.
If the returned pointer is NULL, the system could not locate
a host with this name.

In the old days, this function worked by searching a system file
called /etc/hosts but with the explosive growth
of the Internet, it became impossible for system administrators
to keep this file current. Thus,
the mechanism by which this function works is complex, and
often involves querying large databases all around the country.
The gethostbyname() man
page has more information.
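On current systems gethostbyname() is considered obsolete, and
getaddrinfo() is the usual replacement. The sketch below shows how the
lookup could be done with it; this is an alternative to the tutorial's
code, not the code the tutorial uses, and the helper name resolve_ipv4()
is made up for this example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Look up `host` (a name or dotted address) and `port` (a decimal
 * string) and store the first IPv4 result in `out`.
 * Returns 0 on success, -1 if the lookup fails. */
int resolve_ipv4(const char *host, const char *port, struct sockaddr_in *out)
{
    struct addrinfo hints, *res;

    assert(host != NULL && port != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        /* IPv4 only, like the tutorial */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;
    /* For AF_INET results, ai_addr points to a struct sockaddr_in. */
    memcpy(out, res->ai_addr, sizeof(*out));
    freeaddrinfo(res);
    return 0;
}
```

The filled-in structure already has the family, address and port set in
network byte order, so it can be passed directly to connect().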


    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *)server->h_addr, 
         (char *)&serv_addr.sin_addr.s_addr,
         server->h_length);
    serv_addr.sin_port = htons(portno);

This code sets the fields in serv_addr. Much of
it is the same as in the server. However, because the
field server->h_addr is a character array holding the raw bytes
of the address (not a printable string), we copy it with the function:


void bcopy(char *s1, char *s2, int length)

which copies length bytes from s1 to
s2.


    if (connect(sockfd,(struct sockaddr *) &serv_addr,sizeof(serv_addr)) < 0) 
        error("ERROR connecting");

The connect function is called by the client
to establish a connection to the server. It takes three
arguments, the socket file descriptor, the address of the
host to which it wants to connect (including the port number), and
the size of this address. This function returns 0 on success
and -1 if it fails. The connect()
man page has more information.

Notice that the client needs to
know the port number of the server, but it does not need to know
its own port number. This is typically assigned by the system when
connect is called.


    printf("Please enter the message: ");
    bzero(buffer,256);
    fgets(buffer,255,stdin);
    n = write(sockfd,buffer,strlen(buffer));
    if (n < 0) 
         error("ERROR writing to socket");
    bzero(buffer,256);
    n = read(sockfd,buffer,255);
    if (n < 0) 
         error("ERROR reading from socket");
    printf("%s\n",buffer);
    return 0;
}

The remaining code should be fairly clear. It prompts the
user to enter a message, uses fgets to read the
message from stdin, writes the message to the socket, reads
the reply from the socket, and displays this reply on the screen.
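One detail the sample glosses over: on a stream socket, write() may send fewer bytes than requested. A common way to handle this is to loop until everything has been sent. Here is a minimal sketch, with write_all being a made-up helper name rather than a system call.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes of `buf`
   have been written to `fd`. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);

    size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fd, buf + sent, len - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: try again */
            return -1;              /* real error, errno says which */
        }
        sent += (size_t)n;          /* may be less than what was asked for */
    }
    return 0;
}
```

In the client, write_all(sockfd, buffer, strlen(buffer)) could stand in for the single write() call.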

Enhancements to the server code

The sample server code above has the limitation that it only handles
one connection, and then dies. A "real world" server should run
indefinitely and should have the capability of handling a number of
simultaneous connections, each in its own process. This is typically
done by forking off a new process to handle each new connection.

The
following code has a dummy function called dostuff(int sockfd).
This function will handle the connection after it has been established
and provide whatever services the client requests. As we saw above,
once a connection is established, both ends can use read
and write to send information to the other end, and the
details of the information passed back and forth do not concern us here.
To write a "real world" server, you would make essentially no changes
to the main() function, and all of the code which provided the service would
be in dostuff().

To allow the server to handle multiple simultaneous connections,
we make the following changes to the code:

1. Put the accept statement and the following code in an infinite loop.
2. After a connection is established, call fork() to
   create a new process.
3. The child process will close sockfd and call
   dostuff, passing the new socket file descriptor
   as an argument. When the two processes have completed their
   conversation, as indicated by dostuff() returning,
   this process simply exits.
4. The parent process closes newsockfd. Because
   all of this code is in an infinite loop, it will return to the
   accept statement to wait for the next connection.

Here is the code.


 while (1) {
     newsockfd = accept(sockfd, 
           (struct sockaddr *) &cli_addr, &clilen);
     if (newsockfd < 0) 
         error("ERROR on accept");
     pid = fork();
     if (pid < 0)
         error("ERROR on fork");
     if (pid == 0)  {
         close(sockfd);
         dostuff(newsockfd);
         exit(0);
     }
     else close(newsockfd);
 } /* end of while */

Click here for
a complete server program which includes this change. This
will run with the program client.c.
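The loop above calls dostuff() without showing it. Here is one possible sketch of such a function, assuming the service is simply "read one message, print it, and send back a fixed reply". The reply text and buffer size are choices made for this example, not necessarily what the complete program uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

/* Sketch of a per-connection handler: read one message from the client,
   print it, and send a fixed acknowledgement back. */
void dostuff(int sock)
{
    char buffer[256];
    static const char reply[] = "I got your message";

    assert(sock >= 0);
    memset(buffer, 0, sizeof(buffer));

    ssize_t n = read(sock, buffer, sizeof(buffer) - 1);
    if (n < 0) {
        perror("ERROR reading from socket");
        return;
    }
    printf("Here is the message: %s\n", buffer);

    n = write(sock, reply, sizeof(reply) - 1);
    if (n < 0)
        perror("ERROR writing to socket");
}
```

Because the handler only takes a file descriptor, the forking code in main() does not need to change when the service does.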

The zombie problem

The above code has a problem; if the parent runs for a long time and
accepts many connections, each of these connections will create a
zombie when the connection is terminated. A zombie is a process which
has terminated but cannot be permitted to fully die because at
some point in the future, the parent of the process might execute a
wait and would want information about the death of the
child. Zombies clog up the process table in the kernel, and so they
should be prevented. Unfortunately, the code which prevents zombies
is not consistent across different architectures. When a child dies,
it sends a SIGCHLD signal to its parent. On systems such as AIX, the
following code in main() is all that is needed.


     signal(SIGCHLD,SIG_IGN);

This says to ignore the SIGCHLD signal. However, on systems running
SunOS, you have to use the following code:


void SigCatcher(int n)
{
    /* reap every child that has exited; WNOHANG keeps this from blocking */
    while (wait3(NULL,WNOHANG,NULL) > 0)
        ;
}
...
int main()
{
   ...
   signal(SIGCHLD,SigCatcher);
   ...

The function SigCatcher() will be called whenever the
parent receives a SIGCHLD signal (i.e. whenever a child dies). This
will in turn call wait3, which collects the exit status of the dead
child so that the kernel can remove it from the process table.
The WNOHANG flag is set, which causes this to be a non-blocking wait
(one of my favorite
oxymorons).

Alternative types of sockets

This example showed a stream socket in the Internet domain. This is the
most common type of connection. A second type of connection is a
datagram socket. You might want to use a datagram socket in cases
where there is only one message being sent from the client to the
server, and only one message being sent back.
There are several differences between a datagram socket and
a stream socket.

Datagrams are unreliable,
which means that if a packet of information gets lost somewhere in the
Internet, the sender is not told (and of course the receiver does not
know about the existence of the message). In contrast, with a stream socket,
the underlying TCP protocol will detect that a message was lost because
it was not acknowledged, and it will be retransmitted without
the process at either end knowing about this.

Message boundaries are preserved in datagram sockets.
If the sender sends a datagram of 100
bytes, the receiver must read all 100 bytes at once. This can be
contrasted with a stream socket, where if the sender wrote a 100
byte message, the receiver could read it in two chunks of 50 bytes
or 100 chunks of one byte.

The communication is done using the special system calls sendto() and recvfrom() rather than the more
generic read() and write().

There is a lot less overhead associated with a datagram socket
because connections do not need to be established and broken down,
and packets do not need to be acknowledged. This is why datagram
sockets are often used when the service to be provided is short,
such as a time-of-day service.
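The point about message boundaries can be seen in a small experiment. To keep the sketch self-contained it uses a pair of connected datagram sockets in the Unix domain (created with socketpair()) instead of UDP, which is an assumption of this example. The boundary behaviour is the same. The function name demo_boundaries is hypothetical.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical demo: send two datagrams back to back, then receive with a
   buffer big enough for both. Stores the size of each receive in `sizes`.
   Returns 0 on success, -1 on failure. */
int demo_boundaries(ssize_t sizes[2])
{
    int sv[2];
    char buf[64];

    assert(sizes != NULL);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* Two separate sends ... */
    if (send(sv[0], "hello", 5, 0) != 5 ||
        send(sv[0], "world!", 6, 0) != 6) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* ... arrive as two separate messages, even though buf could hold both. */
    sizes[0] = recv(sv[1], buf, sizeof(buf), 0);
    sizes[1] = recv(sv[1], buf, sizeof(buf), 0);

    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

With a stream socket the first read could just as well have returned all 11 bytes at once.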

Click here for
the server code using a datagram socket.

Click here for
the client code using a datagram socket.

These two programs can be compiled and run in exactly the same way
as the server and client using a stream socket.

Server code with a datagram socket

Most of the server code is similar to the stream socket code. Here
are the differences.


   sock=socket(AF_INET, SOCK_DGRAM, 0);

Note that when the socket is created, the second argument is
the symbolic constant SOCK_DGRAM instead of SOCK_STREAM. The
protocol will be UDP, not TCP.


   fromlen = sizeof(struct sockaddr_in);
   while (1) {
       n = recvfrom(sock,buf,1024,0,(struct sockaddr *)&from,&fromlen);
       if (n < 0) error("recvfrom");

Servers using datagram sockets do not use the listen() or
the accept() system calls. After a socket has been bound
to an address, the program calls recvfrom() to read a
message. This call will block until a message is received. The
recvfrom() system call takes six arguments. The first
three are the same as those for the read() call, the
socket file descriptor, the buffer into which the message will be
read, and the maximum number of bytes. The fourth argument is an
integer argument for flags. This is ordinarily set to zero. The
fifth argument is a pointer to a
sockaddr_in structure. When the call returns, the
values of this structure will have been filled in for the other end of
the connection (the client). The size of this structure will be in
the last argument, a pointer to an integer. This call returns the
number of bytes in the message (or -1 on an error condition). The recvfrom() man page has more
information.


       n = sendto(sock,"Got your message\n",17,
                  0,(struct sockaddr *) &from,fromlen);
       if (n  < 0) error("sendto");
   }
 }

To send a datagram, the function sendto() is used.
This also takes six arguments. The first three are the same as for
a write() call, the socket file descriptor, the
buffer from which the message will be written, and the number of
bytes to write. The fourth argument is an int argument called
flags, which is normally zero. The fifth argument is a pointer
to a sockaddr_in structure. This will contain the
address to which the message will be sent. Notice that in this
case, since the server is replying to a message, the values of this
structure were provided by the recvfrom call. The last argument is
the size of this structure. Note that this is not a pointer to an
int, but an int value itself. The sendto() man page has more information.
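To tie recvfrom() and sendto() together, here is a sketch in which one process plays both client and server over the loopback interface. The function name udp_roundtrip is made up, and the port is picked by the system rather than given on the command line. Because datagrams can in principle be lost, real code would also want a timeout on the final recvfrom().

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical demo: send `msg` to a datagram "server" socket in the same
   process and copy the server's reply into `reply` (NUL-terminated).
   Returns the reply length, or -1 if any step fails. */
ssize_t udp_roundtrip(const char *msg, char *reply, size_t replylen)
{
    static const char answer[] = "Got your message\n";
    struct sockaddr_in server, from;
    socklen_t len = sizeof(server), fromlen = sizeof(from);
    char buf[1024];
    ssize_t n = -1;

    assert(msg != NULL && reply != NULL && replylen > 0);
    int srv = socket(AF_INET, SOCK_DGRAM, 0);
    int cli = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server.sin_port = htons(0);                 /* any free port */

    if (srv >= 0 && cli >= 0 &&
        bind(srv, (struct sockaddr *)&server, sizeof(server)) == 0 &&
        getsockname(srv, (struct sockaddr *)&server, &len) == 0 &&
        /* client to server: no connect(), the destination goes in sendto() */
        sendto(cli, msg, strlen(msg), 0,
               (struct sockaddr *)&server, sizeof(server)) >= 0 &&
        /* server: recvfrom() fills in `from` with the client's address */
        recvfrom(srv, buf, sizeof(buf), 0,
                 (struct sockaddr *)&from, &fromlen) >= 0 &&
        /* server to client: reply to the address recvfrom() reported */
        sendto(srv, answer, sizeof(answer) - 1, 0,
               (struct sockaddr *)&from, fromlen) >= 0) {
        n = recvfrom(cli, reply, replylen - 1, 0, NULL, NULL);
        if (n >= 0)
            reply[n] = '\0';
    }
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    return n;
}
```

Note how fromlen is passed by address to recvfrom() but by value to sendto(), as described above.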

The Client Code

The client code for a datagram socket client is the same as that for
a stream socket with the following differences.

The socket system call has SOCK_DGRAM instead of SOCK_STREAM as
its second argument.
There is no connect() system call.
Instead of read and write, the
client uses recvfrom and sendto, which
are described in detail above.

Sockets in the Unix Domain

Here is the code for a client and server which communicate using
a stream socket in the Unix domain.

Click here for the server program

Click here for the client program

The only difference between a socket in the Unix domain and a socket
in the Internet domain is the form of the address. Here is the
address structure for a Unix Domain address, defined in
the header file sys/un.h.


struct sockaddr_un {
        short sun_family;   /* AF_UNIX */
        char sun_path[108]; /* path name (gag) */
};

The field sun_path has the form of a path name in
the Unix file system. This means that both client and server
have to see the same file system, which in practice means running
on the same machine. Note that on systems running
AFS, such as the Rensselaer Computer System, these sockets must
be created in the directory /tmp. Once a socket
has been created, it remains until it is explicitly deleted, and
its name will appear with the ls command, always with
a size of zero. Sockets in the Unix domain are virtually identical
to named pipes (FIFOs).
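Here is a minimal sketch of the server side of a Unix domain stream socket, to show how the path name takes the place of the host and port. The helper name make_unix_listener and the backlog of 5 are assumptions of this example.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: create a stream socket in the Unix domain, bind it
   to `path` and start listening. Returns the descriptor, or -1 on failure. */
int make_unix_listener(const char *path)
{
    struct sockaddr_un addr;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                      /* path does not fit in sun_path */

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);        /* safe: length was checked above */

    /* bind() creates the socket file; it fails if the name already exists */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Since the file stays behind after the socket is closed, a server typically calls unlink() on the path when it shuts down, or before binding on the next start.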

Designing servers

There are a number of different ways to design servers. These
models are discussed in detail in a book by Douglas E. Comer and
David L. Stevens entitled Internetworking with TCP/IP Volume
III: Client-Server Programming and Applications, published by
Prentice Hall in 1996.
These are summarized here.

Concurrent, connection oriented servers
The typical server in the Internet domain creates a stream socket and
forks off a process to handle each new connection that it receives.
This model is appropriate for services which will do a good deal
of reading and writing over an extended period of time, such as a
telnet server or an ftp server. This model has relatively high
overhead, because forking off a new process is a time consuming
operation, and because a stream socket which uses the TCP protocol has
high kernel overhead, not only in establishing the connection
but also in transmitting information. However, once the
connection has been established, data transmission is reliable in
both directions.

Iterative, connectionless servers
Servers which provide only a single message to the client often
do not involve forking, and often use a datagram socket rather than
a stream socket. Examples include a finger daemon or a timeofday server
or an echo server (a server which merely echoes a message sent by
the client). These servers handle each message as it is received,
in the same process. There is much less overhead with this
type of server, but the communication is unreliable. A request or
a reply may get lost in the Internet, and there is no built-in
mechanism to detect and handle this.

Single process concurrent servers
A server which needs the
capability of handling several clients simultaneously, but where each
connection is I/O dominated (i.e. the server spends most of its time
blocked waiting for a message from the client) is a candidate for a
single process, concurrent server. In this model, one process
maintains a number of open connections, and listens at each for a
message. Whenever it gets a message from a client, it replies quickly
and then listens for the next one. This type of service can be done
with the select system call.
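The heart of such a server is a call to select() that watches several descriptors at once. Below is a minimal sketch of that step only. The helper name wait_readable and the millisecond timeout parameter are made up for this example.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to `timeout_ms` milliseconds for any of the
   `nfds` descriptors in `fds` to become readable. Returns the index of the
   first ready descriptor, or -1 on timeout or error. */
int wait_readable(const int fds[], int nfds, int timeout_ms)
{
    fd_set readset;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&readset);
    for (int i = 0; i < nfds; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readset);       /* watch this descriptor */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() wants the highest descriptor number plus one */
    if (select(maxfd + 1, &readset, NULL, NULL, &tv) <= 0)
        return -1;

    for (int i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readset))
            return i;                   /* this one has data waiting */
    return -1;
}
```

A server would call this in a loop, read from the descriptor it returns, reply, and go back to waiting.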

I'm really sorry for the broken links .... actually I don't prefer to write everything and do some copy-paste as well ..

To view sample codes visit
http://cs.baylor.edu/~donahoo/practical/CSockets/textcode.html

Ch-1 Networking Basics

Computers running on the Internet communicate to each other using either the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP).

When you write Java programs that communicate over the network, you are programming at the application layer. Typically, you don't need to concern yourself with the TCP and UDP layers. Instead, you can use the classes in the java.net package. These classes provide system-independent network communication. However, to decide which Java classes your programs should use, you do need to understand how TCP and UDP differ.
TCP
When two applications want to communicate to each other reliably, they establish a connection and send data back and forth over that connection. This is analogous to making a telephone call. If you want to speak to Aunt Beatrice in Kentucky, a connection is established when you dial her phone number and she answers. You send data back and forth over the connection by speaking to one another over the phone lines. Like the phone company, TCP guarantees that data sent from one end of the connection actually gets to the other end and in the same order it was sent. Otherwise, an error is reported.
TCP provides a point-to-point channel for applications that require reliable communications. The Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), and Telnet are all examples of applications that require a reliable communication channel. The order in which the data is sent and received over the network is critical to the success of these applications. When HTTP is used to read from a URL, the data must be received in the order in which it was sent. Otherwise, you end up with a jumbled HTML file, a corrupt zip file, or some other invalid information.

--------------------------------------------------------------------------------
Definition: TCP (Transmission Control Protocol) is a connection-based protocol that provides a reliable flow of data between two computers.
--------------------------------------------------------------------------------

UDP
The UDP protocol provides for communication that is not guaranteed between two applications on the network. UDP is not connection-based like TCP. Rather, it sends independent packets of data, called datagrams, from one application to another. Sending datagrams is much like sending a letter through the postal service: The order of delivery is not important and is not guaranteed, and each message is independent of any other.

--------------------------------------------------------------------------------
Definition: UDP (User Datagram Protocol) is a protocol that sends independent packets of data, called datagrams, from one computer to another with no guarantees about arrival. UDP is not connection-based like TCP.
--------------------------------------------------------------------------------

For many applications, the guarantee of reliability is critical to the success of the transfer of information from one end of the connection to the other. However, other forms of communication don't require such strict standards. In fact, they may be slowed down by the extra overhead or the reliable connection may invalidate the service altogether.

Consider, for example, a clock server that sends the current time to its client when requested to do so. If the client misses a packet, it doesn't really make sense to resend it because the time will be incorrect when the client receives it on the second try. If the client makes two requests and receives packets from the server out of order, it doesn't really matter because the client can figure out that the packets are out of order and make another request. The reliability of TCP is unnecessary in this instance because it causes performance degradation and may hinder the usefulness of the service.

Another example of a service that doesn't need the guarantee of a reliable channel is the ping command. The purpose of the ping command is to test the communication between two programs over the network. In fact, ping needs to know about dropped or out-of-order packets to determine how good or bad the connection is. A reliable channel would invalidate this service altogether.

--------------------------------------------------------------------------------
Note: Many firewalls and routers have been configured not to allow UDP packets. If you're having trouble connecting to a service outside your firewall, or if clients are having trouble connecting to your service, ask your system administrator if UDP is permitted.
--------------------------------------------------------------------------------

Understanding Ports
Generally speaking, a computer has a single physical connection to the network. All data destined for a particular computer arrives through that connection. However, the data may be intended for different applications running on the computer. So how does the computer know to which application to forward the data? Through the use of ports.
Data transmitted over the Internet is accompanied by addressing information that identifies the computer and the port for which it is destined. The computer is identified by its IP address (32 bits in IPv4), which IP uses to deliver data to the right computer on the network. Ports are identified by a 16-bit number, which TCP and UDP use to deliver the data to the right application.

In connection-based communication such as TCP, a server application binds a socket to a specific port number. This has the effect of registering the server with the system to receive all data destined for that port.

--------------------------------------------------------------------------------
Definition: The TCP and UDP protocols use ports to map incoming data to a particular process running on a computer.
--------------------------------------------------------------------------------

In datagram-based communication such as UDP, the datagram packet contains the port number of its destination and UDP routes the packet to the appropriate application.

Port numbers range from 0 to 65,535 because ports are represented by 16-bit numbers. The port numbers ranging from 0 - 1023 are restricted; they are reserved for use by well-known services such as HTTP and FTP and other system services. These ports are called well-known ports. Your applications should not attempt to bind to them.
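As a small worked example of these ranges, here is a sketch in C (the language of the socket code earlier in this post) that checks a port given as text, for instance from the command line. The helper names parse_port and is_well_known_port are made up for this example. On Unix systems, binding to a port below 1024 usually also requires root privileges.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert decimal text to a port number.
   Returns the port (0 to 65535), or -1 if the text is not a valid port. */
int parse_port(const char *text)
{
    char *end;

    assert(text != NULL);
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;                  /* not a plain number */
    if (value < 0 || value > 65535)
        return -1;                  /* does not fit in 16 bits */
    return (int)value;
}

/* Hypothetical helper: true for the restricted range 0 to 1023. */
int is_well_known_port(int port)
{
    return port >= 0 && port <= 1023;
}
```

A program could reject its port argument when parse_port() returns -1 or when is_well_known_port() is true for the result.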
