Draft                                                      Andre Adrian
Document: draft-conference-01.txt            DFS Deutsche Flugsicherung
Category: Experimental                              November 23rd, 2004
Expires: ?


          Voice over Internet Unicast telephone conference


Status of this Memo

This document specifies a telephone conference implementation for
hands-free Voice over Internet telephony and requests discussion and
suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.

You are allowed to use this source code in any open source or closed
source software you want. You are allowed to use the algorithms for a
hardware solution. You are allowed to modify the source code. You are
not allowed to remove the name of the author from this memo or from
the source code files. You are not allowed to monopolize the source
code or the algorithms behind the source code as your intellectual
property. This source code is free of royalty and comes with no
warranty.

Abstract

This memo describes a VoIP telephone conference algorithm based on a
built-in Multi Conferencing Unit with unicast messages. The one-way
transmission time is evaluated. An implementation in C++ for the
Linux operating system is discussed.

Introduction

The RTP protocol (RFC 3550) describes two possibilities to perform a
telephone conference. The first possibility is to use multicast UDP
messages. The second possibility is to use unicast UDP from every
node to every other node (3 participants need 3 unicast links, 4
participants need 6 unicast links, l = n*(n-1)/2).

This memo describes another possibility to do a telephone conference
with unicast messages. For 3 participants we need 2 unicast links,
for 4 participants we need 3 links, that is l = n-1. This solution is
called "built-in Multi Conferencing Unit with unicast messages" in
the ITU-T H.323 documentation.

As always, you have to pay a price. In our case there are 2
disadvantages: First, the audio mixing is done in one node only.
If this node quits the telephone conference, the telephone conference
is terminated. Second, speaking from one leaf node to another leaf
node needs 2 audio compression/decompression actions. Both
disadvantages are typical for MCU (multi conferencing unit)
solutions.

Audio mixing

The core of every telephone conference is the audio mixing. The audio
mixer is a device with N audio inputs and N audio outputs. Normally
the signal on output i is the sum of all input signals except the
signal from input i.

In Fig. 1 every node has a 3-input/3-output audio mixer. The
connections between A and B and between B and C are the unicast links
mentioned above. Node B is the central node, nodes A and C are leaf
nodes. If user A speaks into his microphone, the audio mixer A will
forward this audio signal to output 2. Output 2 of mixer A is
connected to input 3 of mixer B. Output 2 of mixer B is the sum of
input 1 (user B talking) and input 3 (user A talking). This output 2
of mixer B is connected to input 2 of mixer C. Output 1 of mixer C is
the speaker - and finally user C hears user A talking.

                +-----------+
                |out3    in3|
                |           |
    +--------<--|out2 A  in2|--<--------+
    |           |           |           |
    |   Spk--<--|out1    in1|--<--Mic   |
    |           +-----------+           |
    |                                   |
    |           +-----------+           |
    +-------->--|in3    out3|-->--------+
                |           |
    +-------->--|in2  B out2|-->--------+
    |           |           |           |
    |   Mic-->--|in1    out1|-->--Spk   |
    |           +-----------+           |
    |                                   |
    |           +-----------+           |
    |           |out3    in3|           |
    |           |           |           |
    +--------<--|out2 C  in2|--<--------+
                |           |
        Spk--<--|out1    in1|--<--Mic
                +-----------+

     Figure 1: Central Node B with Leaf Nodes A and C

Audio mixing of audio frames (packets)

In a Voice-over-IP system we have audio frames. These audio frames
contain audio samples for 20 milliseconds or so and have to arrive
every 20 milliseconds to produce a constant audio feed to the
speaker. The audio mixer above is a pure sample-by-sample mixer and
does not know about audio frames. The handling of audio frames is
done with circular buffers.
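
The re-framing role of such a circular buffer can be sketched as
follows. This is a minimal illustration and not the cirbuf.cpp used by
intercomd: the class name FrameRing, its methods and its overrun
behaviour are assumptions made for this example. It accepts small
sound card fragments (for example 4ms, which is 64 bytes at 8kHz,
16 bit mono) and hands out 20ms frames (320 bytes) once enough data
has accumulated.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical byte ring buffer that converts between frame sizes.
class FrameRing {
  std::vector<char> buf_;
  std::size_t head_ = 0;   // index of the oldest stored byte
  std::size_t len_ = 0;    // number of bytes currently stored

public:
  // A capacity of 0 is rounded up to 1 to keep the modulo well defined.
  explicit FrameRing(std::size_t capacity)
    : buf_(capacity > 0 ? capacity : 1) {}

  std::size_t size() const { return len_; }

  // Append n bytes; returns false (and stores nothing) on overrun.
  bool push(const char *src, std::size_t n) {
    if (n > buf_.size() - len_) return false;
    for (std::size_t i = 0; i < n; ++i)
      buf_[(head_ + len_ + i) % buf_.size()] = src[i];
    len_ += n;
    return true;
  }

  // Remove n bytes; returns false (dst untouched) if fewer are stored.
  bool pop(char *dst, std::size_t n) {
    if (n > len_) return false;
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = buf_[(head_ + i) % buf_.size()];
    head_ = (head_ + n) % buf_.size();
    len_ -= n;
    return true;
  }
};
```

In this sketch the producer calls push() once per 64-byte fragment and
the consumer calls pop() with 320 bytes; pop() only succeeds after
five fragments have been pushed.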
These buffers are located at the < and > positions in Fig. 1. We need
the buffers also because frames have different sizes. The audio
frames of the OSS (Open Sound System) are limited to powers of 2,
like 4ms, 8ms, 16ms. The audio frames on the network contain 20ms,
30ms or 80ms of speech as defined in the codec documentation or in
RFC3551. To have minimum latency the software uses 4ms OSS frames for
20ms network frames. For 40ms network frames it is possible to use
8ms OSS frames, for 80ms network frames we can use 16ms OSS frames.

The creation of frames is driven by the microphone. The OSS system
will create a software interrupt for every microphone frame.

Note: Because the reason for this software interrupt is a hardware
interrupt, the jitter of this interrupt is less than the jitter of an
operating system timer interrupt - at least for Linux.

With an actual microphone audio frame the audio mixer can perform the
mixing. The other network inputs are used as available - there is no
waiting. The mixer creates an output of microphone frame size. This
output is added to the output circular buffers. If an individual
output buffer has enough data to fill a network frame, the codec
compression is done and a network packet is sent.

Latency of built-in Multi-Conferencing with unicast messages

Everybody knows that a 20 millisecond network audio frame will give
you a one-way transmission time of at least 20 milliseconds. Most
people will further agree that RTT (round trip time) divided by 2 is
a (good enough) estimate for the transport delay. This section will
investigate the impact of circular buffers on the real one-way
transmission time.

First we look at a normal telephone call. The microphone buffer needs
4ms (or 8ms, 16ms). The codec needs 20ms. Together we have 24ms in
the transmitting node. The receiving node uses a speaker buffer that
is equal to the microphone buffer. Together we have 2 times audio
buffer, the codec frame time and the transport time in the network.
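
The estimate for a normal telephone call can be written out as a
small calculation. This is only a sketch of the sum described above;
the function name one_way_ms and the 5ms transport delay used in the
example values are illustrative assumptions.

```cpp
// One-way transmission time of a normal (two-party) call in
// milliseconds: microphone buffer + codec frame + network transport
// + speaker buffer, where the speaker buffer is as long as the
// microphone buffer.
double one_way_ms(double audio_buffer_ms, double codec_frame_ms,
                  double transport_ms)
{
  return 2.0 * audio_buffer_ms + codec_frame_ms + transport_ms;
}
```

With 4ms OSS frames, 20ms network frames and an assumed transport
delay of 5ms, one_way_ms(4, 20, 5) gives 33ms. The 16ms/80ms
combination gives 117ms for the same transport delay.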
This calculation is quite different from the ITU-T calculation in
G.114. For G.114 the audio buffer time is equal to the codec frame
time.

Now we look at a telephone conference. The one-way transmission time
from and to the central node is equal to the normal telephone call
case. The leaf-node to leaf-node one-way transmission time is worse.
Because the central node has to read a network audio frame, mix the
contents and write another network audio frame, the calculation is 2
times audio buffer, 2 times the codec frame time and 2 times the
transport time in the network.

Latency of Intercom conference call

To limit the one-way transmission time, the "everybody talks/listens
with everybody else" approach of a telephone conference call can be
reduced to an intercom conference call. In an intercom conference
call the leaf nodes do not talk/listen to each other, there is only
talk/listen from and to the central node. If users are satisfied with
this limited conference call, the one-way transmission time is again
the number we calculated for a normal telephone call.

Software design

The application is split into three processes: the application start
script intercom, the audio and network process intercomd, written in
C++, and the graphical user interface intercom.tcl, written in
Tcl/Tk.

Note: The libraries for graphical user interfaces often make it
impossible to use the select() call. Therefore the application was
split into a realtime process intercomd and a GUI process
intercom.tcl.

The start script intercom sets up the hardware audio mixer for
hardware acoustic echo cancellation. Then it starts the intercomd
process and gives this process realtime priority. Finally, the GUI
process is started.

The audio network packets use the iLBC codec to save network
bandwidth. The software sends a network packet every 80ms. In the
network packet there are 4 iLBC frames, each 20ms long.
This is not good for packet loss handling, but good for network
bandwidth - with all headers (IPv4, UDP, RTP) we have 21kBit/s for
one audio link. With a network packet every 20ms the bandwidth is
37kBit/s. The iLBC codec itself needs 15.2kBit/s.

Interprocess communication

The audio and network process intercomd listens as TCP server on port
4999. The GUI process intercom.tcl operates as TCP client. Both
processes are normally running on the same computer and can
communicate via localhost (IP address 127.0.0.1). It is possible to
run both processes on different computers.

To transport audio network packets the Real-time Transport Protocol
(RTP, RFC3550) is used with IPv4 unicast UDP network packets.

Program options

The application understands the following options:

-a Value       set ambient background noise to decibel value. For
               optimum acoustic echo cancellation the background
               noise level should be given. The application measures
               the noise level every 5 seconds if there is a
               connection to another intercom and writes an
               "Ambient =" message to standard error.

-l             use hardware AEC and Line-in connector for the
               microphone. This option was tested with Sennheiser
               microphone capsule ME34, Sennheiser gooseneck MZH3040
               and Behringer microphone amplifier MIC100.

-m             use hardware AEC and Mic-in connector for the
               microphone. This option was tested with Labtec Mic
               333.

-p Portnumber  set RTP port number. Default is 5000. All audio
               communication is done on this port number. Source and
               destination port number are equal.

-t             do telephone conference call. Everybody can
               listen/talk to everybody else. Default is intercom
               conference call - see above.

Program configuration

The program assumes that there is only one intercom user for every IP
address. The GUI process handles the Name-to-IP translation with a
fixed table. The Name is the text on the direct access button. See
the intercom.tcl source code for details of the - very primitive -
translation.
You will have to change the intercom.tcl file for your IP addresses!

Appendix

The following appendices contain the source code. The source code was
compiled and tested on some IA32 computers (Intel Celeron, Intel
Centrino, AMD Athlon, AMD Athlon XP). Operating System was Linux
Kernel 2.6 (SuSE distribution 9.1) with GCC 3.3.3, Tcl/Tk 8.4, OSS
3.8.2 and ALSA 1.0.3. Besides the compile tools, the RPM packages
rtstools and alsa are needed. Further you need the iLBC codec sources
from
http://www.ietf.org/internet-drafts/draft-ietf-avt-ilbc-codec-05.txt

Compile

   cd ilbc
   make
   cd ..
   c++ -O2 -o intercomd aec.cpp cirbuf.cpp oss.cpp rtp.cpp tcp.cpp \
     udp.cpp intercomd.cpp ilbc/ilbc.a -lm

Run

   ./intercom

Note: Read the shell script intercom to get an idea of what is going
on.

For a first test talk to yourself by clicking the button for your own
IP address, e.g. EDDF TEC2 if your computer is 192.168.1.2. The
button should become green.

Attention: A short click (less than 300ms) is a toggle switch, a long
click is a push-to-talk.

/****************** APPENDIX intercom ******************/
#!/bin/bash
# intercom
#
# Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
#
# Voice-over-IP Intercom start script for Linux with ALSA
#
# Version 1.0

echo "usage: intercom [OPTIONS] [Partner1-IP [Partner2-IP ...]]"
echo "  -a value    set ambient (background) noise to value decibel"
echo "  -l          use hardware AEC and Line-in for microphone"
echo "  -m          use hardware AEC and Mic-in for microphone"
echo "  -p Number   RTP Portnumber (default is 5000)"
echo "  -t          Telephone conference call (everybody with everybody)"
echo ""

# delete old process
killall -9 intercomd 2>/dev/null

# configure audio mixer (AC97 compatible)
# check your mixer hardware with:
#   cat /dev/sndstat
#
# hardware AEC test successful with on-board sound mixers:
#   Analog Devices AD1985, ICEnsemble ICE1232, Realtek ALC650 and ALC655,
#   SigmaTel STAC9750/51
#
# hardware AEC test successful with PCI sound cards:
#   Soundblaster PCI128
#
# hardware AEC test failed with PCI sound cards:
#   Soundblaster Audigy 2, C-Media 8738

# set playback volume
amixer -q sset 'PCM',0 70%
amixer -q sset 'Master',0 70%
# use only PCM for playback
amixer -q set 'Master',0 unmute
amixer -q set 'PCM',0 unmute
amixer -q set 'Mic',0 mute
amixer -q set 'Line',0 mute
amixer -q set 'CD',0 mute
amixer -q set 'Aux',0 mute
# enable recording
amixer -q cset iface=MIXER,name='Capture Switch' 1

# handle options
for argv in "$@"; do
  # echo $argv
  case $argv in
    ("-l")
      # for Hardware AEC and Line-In Capture
      amixer -q set 'Capture',0 0%-,100%-
      amixer -q cset iface=MIXER,name='Capture Source' 4,5
      ;;
    ("-m")
      # for Hardware AEC and Mic-In Capture
      amixer -q set 'Capture',0 0%-,100%-
      amixer -q cset iface=MIXER,name='Capture Source' 0,5
      ;;
  esac
done

# start audio/network daemon
./intercomd "$@" &
sleep 1

# give audio/network daemon realtime process prio
sudo /usr/sbin/setpriority $(pidof intercomd) fifo 1
# To make sudo work without password add as superuser with the program
# visudo:
# %users ALL=(root) NOPASSWD: /usr/sbin/setpriority

# start graphical user interface
./intercom.tcl "$@" &

/****************** APPENDIX intercomd.h
******************/
/* intercomd.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Voice over IP Intercom with Telephone conference and Acoustic Echo
 * Cancellation using unicast RTP messages (RFC3550)
 *
 * Version 1.0
 */
#ifndef _INTERCOMD_H

#define ERROR (-1)
#define OKAY 0

#define NO 0
#define YES 1

/* Emit program info and abort the program if expr is false with errno */
#define assert_errno(expr) \
  if(!(expr)) { \
    fprintf(stderr, "voipconf: %s:%d: %s: Assertion '%s' failed. errno=%s\n", \
      __FILE__, __LINE__, __PRETTY_FUNCTION__, __STRING(expr), strerror(errno)); \
    exit(1); \
  }

/* Emit program info and return function if expr is true with retvalue */
#define return_if(expr, retvalue) \
  if(expr) { \
    fprintf(stderr, "voipconf: %s:%d: %s: Check '%s' failed.\n", \
      __FILE__, __LINE__, __PRETTY_FUNCTION__, __STRING(expr)); \
    return(retvalue); \
  }

int print_gui(const char *fmt, ...);

#define _INTERCOMD_H
#endif

/****************** APPENDIX intercomd.cpp ******************/
/* intercomd.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Voice over IP Intercom with Telephone conference and Acoustic Echo
 * Cancellation using unicast RTP messages (RFC3550)
 *
 * Attention. This source code is not very portable! You need:
 * iLBC Codec Sourcecode draft-ietf-avt-ilbc-codec-05.txt
 * same endian for CPU and soundcard for 16bit audio sample
 * Open Source Sound (OSS) support
 * ALSA Sound support for hardware (2-channel) AEC
 *
 * Compile Sourcecode:
 c++ -O2 -o intercomd aec.cpp cirbuf.cpp oss.cpp rtp.cpp tcp.cpp udp.cpp \
 intercomd.cpp ilbc/ilbc.a -lm
 *
 * Format Sourcecode:
 indent -kr -i2 -nlp -ci2 -l72 -lc72 -nut voipconf.cpp
 *
 * To be done:
 * Sometimes click noise with telephone conference after adding 3.
 * node
 * Packet loss concealment handling
 * Better Jitter buffer handling
 * open/close audio io on demand
 *
 * Version 1.0
 */

/* NOTE: the header names in the #include lines for system headers are
 * missing in this copy of the memo. The names below are an assumption,
 * chosen from the library functions the code uses. */
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <string.h>
#include <math.h>

/* Error handling */
#include <assert.h>
#include <errno.h>

/* low level io */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/time.h>

/* Socket io */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* iLBC codec */
#include "ilbc/iLBC_define.h"
#include "ilbc/iLBC_encode.h"
#include "ilbc/iLBC_decode.h"

#include "rtp.h"
#include "udp.h"
#include "tcp.h"
#include "aec.h"
#include "oss.h"
#include "cirbuf.h"
#include "intercomd.h"

/* Design Constants */
#define PARTNERS 5              /* maximum telephony partners */

/* End of Design Constants */

#define FORMAT_RTP PT_iLBC
#define ILBC_MODE 20
#define ILBCSIZE NO_OF_BYTES_20MS
#define FRAMESIZE (20*8*2)      /* compression frame size */
#define MTU 1460                /* Maximum Transfer Unit */
#define PACKETDURATION (FRAMES*20*8)

/* audio */
static CIRBUF mic_cirbuf;
static CIRBUF spk_cirbuf;
static AEC aec;
static int channels = 1;

/* network transmitting to partners */
static in_addr_t to_ip[PARTNERS];
static RTP to_rtp[PARTNERS];
static UDP to_udp[PARTNERS];
static iLBC_Enc_Inst_t Enc_Inst[PARTNERS];
static CIRBUF conf_cirbuf[PARTNERS];
static char tx_buf[PARTNERS][MTU];
static char *tx_pbuf[PARTNERS];
static int tx_frames[PARTNERS];
static int to_partners = 0;
static int telephone_conference = 0;

/* network receiving from partners */
static in_addr_t from_ip[PARTNERS];
static unsigned long from_ssrc[PARTNERS];
static int from_cnt[PARTNERS];
static CIRBUF from_cirbuf[PARTNERS];
static iLBC_Dec_Inst_t Dec_Inst[PARTNERS];
static int from_partners = 0;

/*----------------------------------------------------------------*
 * Encoder interface function
 *---------------------------------------------------------------*/

short encode(                   /* (o) Number of bytes encoded */
  iLBC_Enc_Inst_t * iLBCenc_inst,       /* (i/o) Encoder instance */
  short *encoded_data,          /* (o) The encoded bytes */
  short *data                   /* (i) The signal
                                   block to encode */
  )
{
  float block[BLOCKL_MAX];
  int k;

  /* convert signal to float */
  for (k = 0; k < iLBCenc_inst->blockl; k++)
    block[k] = (float) data[k];

  /* do the actual encoding */
  iLBC_encode((unsigned char *) encoded_data, block, iLBCenc_inst);

  return (iLBCenc_inst->no_of_bytes);
}

int tx_buf_init(int i)
{
  tx_pbuf[i] = tx_buf[i];
  tx_frames[i] = 0;
  return OKAY;
}

int audio_read(int audio_fd)
{
  /* fat software interrupt routine: read audio, send UDP packets */
  short mic_buf[FRAGSIZE / 2];

  if (1 == channels) {
    size_t len = read(audio_fd, mic_buf, FRAGSIZE);
    return_if(len != FRAGSIZE, ERROR);

    if (0 == to_partners) {
      /* start assembling send packets only if we have a target */
      return OKAY;
    }

    short spk_buf[FRAGSIZE / 2];
    spk_cirbuf.pop((char *) spk_buf, FRAGSIZE);

    /* Acoustic Echo Cancellation - using software buffers */
    int i;
    for (i = 0; i < FRAGSIZE / 2; ++i) {
      mic_buf[i] = aec.doAEC(mic_buf[i], spk_buf[i]);
    }
  } else {
    short mic2_buf[FRAGSIZE];
    size_t len = read(audio_fd, mic2_buf, 2 * FRAGSIZE);
    return_if(len != 2 * FRAGSIZE, ERROR);

    if (0 == to_partners) {
      /* start assembling send packets only if we have a target */
      return OKAY;
    }

    /* Acoustic Echo Cancellation - using hardware audio mixer */
    int i;
    for (i = 0; i < FRAGSIZE / 2; ++i) {
      mic_buf[i] = aec.doAEC(mic2_buf[2 * i], mic2_buf[2 * i + 1]);
    }
  }

  int ret = mic_cirbuf.push((char *) mic_buf, FRAGSIZE);
  if (ret < 0) {
    fprintf(stderr, "mic_cirbuf.push overrun\n");
  }

  if (mic_cirbuf.getlen() >= FRAMESIZE) {
    /* My RFC3551 interpretation: Only one RTP Header for packets
     * with a number of frames */
    int i;
    for (i = 0; i < PARTNERS; ++i) {
      if (to_ip[i] && 0 == tx_frames[i]) {
        /* put RTP header */
        tx_pbuf[i] = RTP_network_copy(tx_pbuf[i], &to_rtp[i]);
      }
    }

    /* put payload (audio) */
    short micbuf[FRAMESIZE / 2];
    mic_cirbuf.pop((char *) micbuf, FRAMESIZE);

    if (telephone_conference) {
      /* telephone conference mix - everybody from/to everybody else */
      short from_buf[PARTNERS][FRAMESIZE / sizeof(short)];
      short sum_buf[FRAMESIZE /
        sizeof(short)];
      int i, j, k;

      /* get audio from other partners */
      for (i = 0; i < PARTNERS; ++i) {
        conf_cirbuf[i].pop((char *) from_buf[i], FRAMESIZE);
      }

      for (i = 0; i < PARTNERS; ++i) {
        if (to_ip[i]) {
          for (j = 0; j < FRAMESIZE / sizeof(short); ++j) {
            /* mix */
            long sum = micbuf[j];
            for (k = 0; k < PARTNERS; ++k) {
              if (to_ip[i] != from_ip[k]) {     /* do not mix in origin */
                sum += from_buf[k][j];
              }
            }
            /* clip */
            if (sum > 32767) {
              sum_buf[j] = 32767;
            } else if (sum < -32767) {
              sum_buf[j] = -32767;
            } else {
              sum_buf[j] = sum;
            }
          }
          /* do encoding (audio compression) */
          short encoded_data[ILBCSIZE / 2];
          int len = encode(&Enc_Inst[i], encoded_data, sum_buf);

          /* distribute to transmit buffers */
          memcpy(tx_pbuf[i], encoded_data, len);
          tx_pbuf[i] += len;
          assert(tx_pbuf[i] - tx_buf[i] <= MTU);
        }
      }
    } else {
      /* intercom conference mixing - central node from/to other nodes */
      /* do encoding (audio compression) */
      short encoded_data[ILBCSIZE / 2];
      int len = encode(&Enc_Inst[0], encoded_data, micbuf);

      /* distribute to transmit buffers */
      int i;
      for (i = 0; i < PARTNERS; ++i) {
        if (to_ip[i]) {
          memcpy(tx_pbuf[i], encoded_data, len);
          tx_pbuf[i] += len;
          assert(tx_pbuf[i] - tx_buf[i] <= MTU);
        }
      }
    }

    /* transmit data packet(s) */
    for (i = 0; i < PARTNERS; ++i) {
      if (to_ip[i] && ++tx_frames[i] >= FRAMES) {
        to_udp[i].send(tx_buf[i], tx_pbuf[i] - tx_buf[i]);

        /* prepare next go */
        tx_buf_init(i);
        to_rtp[i].next(PACKETDURATION);
      }
    }
  }
  return OKAY;
}

int partner_timeout()
{
  /* Delete old from_ssrc[] entries - this is not very quick!
   */
  int i;

  for (i = 0; i < PARTNERS; ++i) {
    if (from_ssrc[i] && from_cnt[i] == 0) {
      char s[20];
      print_gui("d %s\n", iptoa(s, from_ip[i]));
      from_ssrc[i] = 0;
      from_ip[i] = 0;
      --from_partners;
    }
    from_cnt[i] = 0;
  }
  return OKAY;
}

int partner_lookup(unsigned long ssrc, in_addr_t ip)
{
  /* search */
  int i;
  for (i = 0; i < PARTNERS; ++i) {
    if (from_ssrc[i] == ssrc) {
      ++from_cnt[i];
      return i;                 /* old entry */
    }
  }
  /* add new entry */
  for (i = 0; i < PARTNERS; ++i) {
    if (0 == from_ssrc[i]) {
      if (0 == from_partners) {
        spk_cirbuf.init();
      }
      from_ssrc[i] = ssrc;
      from_ip[i] = ip;
      from_cnt[i] = 1;
      initDecode(Dec_Inst + i, ILBC_MODE, 1);
      conf_cirbuf[i].init();
      ++from_partners;
      char s[20];
      print_gui("r %s\n", iptoa(s, ip));
      return i;
    }
  }
  return ERROR;
}

void audio_write(int audio_fd)
{
  int i, j;
  short from_buf[PARTNERS][FRAGSIZE / sizeof(short)];
  short sum_buf[FRAGSIZE / sizeof(short)];

  /* get audio */
  for (i = 0; i < PARTNERS; ++i) {
    from_cirbuf[i].pop((char *) from_buf[i], FRAGSIZE);
  }

  for (j = 0; j < FRAGSIZE / sizeof(short); ++j) {
    /* mix */
    long sum = 0;
    for (i = 0; i < PARTNERS; ++i) {
      sum += from_buf[i][j];
    }
    /* clip */
    if (sum > 32767) {
      sum_buf[j] = 32767;
    } else if (sum < -32767) {
      sum_buf[j] = -32767;
    } else {
      sum_buf[j] = sum;
    }
  }

  if (1 == channels) {
    if (from_partners > 0) {
      /* save for 1-channel AEC */
      int ret = spk_cirbuf.push((char *) sum_buf, FRAGSIZE);
      if (ret < 0) {
        /* fprintf(stderr, "spk_cirbuf.push overrun\n"); */
      }
    }
    write(audio_fd, sum_buf, FRAGSIZE);
  } else {
    short sum2_buf[FRAGSIZE];
    int i;
    for (i = 0; i < FRAGSIZE / 2; ++i) {
      sum2_buf[2 * i] = 0;      /* left channel silence */
      sum2_buf[2 * i + 1] = sum_buf[i]; /* right channel spk */
    }
    write(audio_fd, sum2_buf, 2 * FRAGSIZE);
  }
}

/*----------------------------------------------------------------*
 * Decoder interface function
 *---------------------------------------------------------------*/

short decode(                   /* (o) Number of decoded samples */
  iLBC_Dec_Inst_t * iLBCdec_inst,       /* (i/o) Decoder instance */
  short *decoded_data,          /*
                                   (o) Decoded signal block */
  short *encoded_data,          /* (i) Encoded bytes */
  short mode                    /* (i) 0=PL, 1=Normal */
  )
{
  int k;
  float decblock[BLOCKL_MAX], dtmp;

  /* check if mode is valid */
  if (mode < 0 || mode > 1) {
    printf("\nERROR - Wrong mode - 0, 1 allowed\n");
    exit(3);
  }

  /* do actual decoding of block */
  iLBC_decode(decblock, (unsigned char *) encoded_data, iLBCdec_inst,
    mode);

  /* convert to short */
  for (k = 0; k < iLBCdec_inst->blockl; k++) {
    dtmp = decblock[k];

    if (dtmp < MIN_SAMPLE)
      dtmp = MIN_SAMPLE;
    else if (dtmp > MAX_SAMPLE)
      dtmp = MAX_SAMPLE;
    decoded_data[k] = (short) dtmp;
  }

  return (iLBCdec_inst->blockl);
}

static unsigned short rtp_port = 5000;

int udp_read(int udp_fd)
{
  /* software interrupt routine */
  char buf[MTU];
  struct sockaddr_in from_sock;
  socklen_t from_socklen = sizeof(sockaddr_in);
  RTP rtp_in;

  int len = recvfrom(udp_fd, buf, MTU, 0,
    (struct sockaddr *) &from_sock, &from_socklen);
  return_if(sizeof(RTP) + FRAMES * ILBCSIZE != len, ERROR);

  /* check Port number */
  in_addr_t from_ip = ntohl(from_sock.sin_addr.s_addr);
  in_port_t from_port = ntohs(from_sock.sin_port);
  return_if(from_port != rtp_port, ERROR);

  char *pbuf = RTP_host_copy(&rtp_in, buf);
  int rc = rtp_in.check(FORMAT_RTP);
  return_if(rc, ERROR);

  int partner = partner_lookup(rtp_in.getssrc(), from_ip);
  return_if(partner < 0, ERROR);

  int i;
  for (i = 0; i < FRAMES; ++i, pbuf += ILBCSIZE) {
    /* do decoding (audio decompression) */
    /* tbd.: Packet loss concealment */
    short decoded_data[FRAMESIZE / 2];
    len = decode(Dec_Inst + partner, decoded_data, (short *) pbuf, 1);
    return_if(len != FRAMESIZE / 2, ERROR);

    int ret = from_cirbuf[partner].push((char *) decoded_data, 2 * len);
    if (ret < 0) {
      fprintf(stderr, "from_cirbuf[%d].push overrun=%d\n", partner,
        ret / FRAGSIZE);
    }

    if (telephone_conference) {
      ret = conf_cirbuf[partner].push((char *) decoded_data, 2 * len);
      if (ret < 0) {
        /* fprintf(stderr, "conf_cirbuf[%d].push overrun=%d\n",
           partner, ret/FRAGSIZE); */
      }
    }
  }
  return OKAY;
}

void command(char
  *cmd, int udp_fd)
{
  /* delete special characters like \r, \n */
  int i;
  for (i = 0; i < strlen(cmd); ++i) {
    if (cmd[i] < ' ') {         /* hack: assume ASCII coding */
      cmd[i] = 0;
      break;
    }
  }

  in_addr_t ip;
  switch (cmd[0]) {
  default:
    printf("voipconf commands:\n"
      "c IP-Address - connect to IP-Address\n"
      "h IP-Address - hang-up IP-Address\n\n");
    fflush(stdout);
    break;
  case 'c':
    ip = atoip(cmd + 2);
    for (i = 0; i < PARTNERS; ++i) {
      if (0 == to_ip[i]) {
        if (0 == to_partners) {
          mic_cirbuf.init();
        }
        tx_buf_init(i);
        to_ip[i] = ip;
        to_rtp[i].init(FORMAT_RTP);
        to_udp[i].send_init(cmd + 2, rtp_port, udp_fd);
        ++to_partners;
        break;
      }
    }
    break;
  case 'h':
    ip = atoip(cmd + 2);
    for (i = 0; i < PARTNERS; ++i) {
      if (ip == to_ip[i]) {
        to_ip[i] = 0;
        to_udp[i].send_close();
        --to_partners;
        print_gui("%s\n", cmd); /* Tcl/Tk needs \n */
        break;
      }
    }
    break;
  }
  /* fprintf(stderr, "cmd=%s to_partners=%d\n", cmd, to_partners); */
}

#define CMDLEN 80

int gui_read(int gui_fd, int udp_fd)
{
  char cmd[CMDLEN];

  /* leave room for the terminating 0 that command() relies on */
  int len = read(gui_fd, cmd, CMDLEN - 1);

  if (len <= 0) {
    fprintf(stderr, "gui_read() close\n");
    int ret = shutdown(gui_fd, SHUT_RDWR);
    assert_errno(ret >= 0);
    return -1;
  }
  cmd[len] = 0;                 /* read() does not terminate the string */
  command(cmd, udp_fd);
  return gui_fd;
}

static int gui_fd = -1;

int print_gui(const char *fmt, ...)
{
  /* in fmt: Formatstring as printf */
  /* in ...: Parameter(s) as printf */

  if (gui_fd >= 0) {
    char s[MTU];
    va_list ap;
    va_start(ap, fmt);
    (void) vsnprintf(s, MTU, fmt, ap);
    va_end(ap);
    int len = strlen(s);

    return write(gui_fd, s, len);
  } else {
    return ERROR;
  }
}

struct timeval difftimeval(struct timeval time1, struct timeval time2)
{
  struct timeval diff;

  diff.tv_usec = time1.tv_usec - time2.tv_usec;
  if (diff.tv_usec < 0) {
    /* borrow one second */
    diff.tv_usec += 1000000;
    time2.tv_sec += 1;
  }
  diff.tv_sec = time1.tv_sec - time2.tv_sec;

  return diff;
}

float dB2q(float dB)
{
  /* Decibel to Ratio */
  return powf(10.0f, dB / 20.0f);
}

float q2dB(float q)
{
  /* Ratio to Decibel */
  return 20.0f * log10f(q);
}

/* program main loop.
   OS Event handler */
int loop(int audio_fd, int udp_fd, int gui_listen_fd)
{
  struct timeval timeout;
  fd_set read_fds;
  int max_fd = 64;              /* should be max(fd, ..) + 1 */
  static struct timeval last_partner_timeout;
  static struct timeval last_getambient;
  struct timezone tz;

  gettimeofday(&last_partner_timeout, &tz);
  gettimeofday(&last_getambient, &tz);
  for (;;) {
    timeout.tv_sec = 0;
    timeout.tv_usec = FRAGTIME * 1000;
    FD_ZERO(&read_fds);
    FD_SET(audio_fd, &read_fds);
    FD_SET(udp_fd, &read_fds);
    FD_SET(gui_listen_fd, &read_fds);
    if (gui_fd >= 0) {
      FD_SET(gui_fd, &read_fds);
    }
    int ret = select(max_fd, &read_fds, NULL, NULL, &timeout);
    assert_errno(ret >= 0);

    if (FD_ISSET(audio_fd, &read_fds)) {
      /* audio_write(audio_fd); */
      audio_read(audio_fd);
    }
    if (FD_ISSET(udp_fd, &read_fds)) {
      udp_read(udp_fd);
    }
    if (FD_ISSET(gui_listen_fd, &read_fds)) {
      gui_fd = tcp_server_init2(gui_listen_fd);
    }
    if (gui_fd >= 0) {
      if (FD_ISSET(gui_fd, &read_fds)) {
        gui_fd = gui_read(gui_fd, udp_fd);
      }
    }
    /* because of problems with Intel ICH5/Analog Devices AD1985 */
    audio_write(audio_fd);

    struct timeval now, diff;
    gettimeofday(&now, &tz);
    diff = difftimeval(now, last_partner_timeout);
    if (diff.tv_usec >= 160000) {       /* 2*PACKETDURATION in usec */
      last_partner_timeout = now;
      partner_timeout();
    }
    gettimeofday(&now, &tz);
    diff = difftimeval(now, last_getambient);
    if (diff.tv_sec >= 5) {
      last_getambient = now;
      if (to_partners > 0) {
        float ambient = aec.getambient();
        float ambientdB = q2dB(ambient / 32767.0f);
        fprintf(stderr, "Ambient = %2.0f dB\n", ambientdB);
      }
    }
  }
  return ERROR;
}

int main(int argc, char *argv[])
{
  int i;
  for (i = 1; i < argc && '-' == argv[i][0]; ++i) {
    switch (argv[i][1]) {
    case 'a':                  /* set Ambient (No Talking) Noise level */
      aec.setambient(MAXPCM*dB2q(atof(argv[++i])));
      break;
    case 'l':                  /* use hardware AEC and Line-in for microphone */
      channels = 2;
      break;
    case 'm':                  /* use hardware AEC and Mic-in for microphone */
      channels = 2;
      break;
    case 'p':                  /* RTP Portnumber (default is 5000) */
      rtp_port = atoi(argv[++i]);
      rtp_port &= 0xFFFE;       /* RFC3550: RTP port has even port number */
      break;
    case 't':                  /* Telephone conference call (true conference) */
      telephone_conference = 1;
      break;
    }
  }

  /* open Audio Transmit and Receive */
  int audio_fd = audio_init("/dev/dsp", channels);
  assert(audio_fd >= 0);

  /* open Network Receive */
  int udp_fd = UDP_recv_init(rtp_port);
  assert(udp_fd >= 0);

  /* open Graphical User Interface as TCP server */
  int gui_listen_fd = tcp_server_init(4999);

  /* iLBC codec Initialization */
  int j;
  for (j = 0; j < PARTNERS; ++j) {
    initEncode(Enc_Inst + j, ILBC_MODE);
    initDecode(Dec_Inst + j, ILBC_MODE, 1);
  }

  /* open Network Transmit Partner (Connections) */
  for (; i < argc; ++i) {
    if (0 == to_partners) {
      mic_cirbuf.init();
    }
    tx_buf_init(to_partners);
    to_ip[to_partners] = atoip(argv[i]);
    to_rtp[to_partners].init(FORMAT_RTP);
    to_udp[to_partners].send_init(argv[i], rtp_port, udp_fd);
    ++to_partners;
  }

  loop(audio_fd, udp_fd, gui_listen_fd);

  return OKAY;
}

/****************** APPENDIX aec.h ******************/
/* aec.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Acoustic Echo Cancellation NLMS-pw algorithm
 *
 * Version 1.3 filter created with www.dsptutor.freeuk.com
 */

#ifndef _AEC_H                  /* include only once */

// use double if your CPU does software-emulation of float
typedef float REAL;

/* dB Values */
const REAL M0dB = 1.0f;
const REAL M3dB = 0.71f;
const REAL M6dB = 0.50f;
const REAL M9dB = 0.35f;
const REAL M12dB = 0.25f;
const REAL M18dB = 0.125f;
const REAL M24dB = 0.063f;

/* dB values for 16bit PCM */
/* MxdB_PCM = 32767 * 10 ^ (-x / 20) */
const REAL M10dB_PCM = 10362.0f;
const REAL M20dB_PCM = 3277.0f;
const REAL M25dB_PCM = 1843.0f;
const REAL M30dB_PCM = 1036.0f;
const REAL M35dB_PCM = 583.0f;
const REAL M40dB_PCM = 328.0f;
const REAL M45dB_PCM = 184.0f;
const REAL M50dB_PCM = 104.0f;
const REAL M55dB_PCM = 58.0f;
const REAL M60dB_PCM = 33.0f;
const REAL M65dB_PCM = 18.0f;
const REAL M70dB_PCM = 10.0f;
const REAL M75dB_PCM = 6.0f;
const REAL M80dB_PCM = 3.0f;
const REAL M85dB_PCM = 2.0f;
const REAL M90dB_PCM = 1.0f;

const REAL MAXPCM = 32767.0f;

/* Design constants (change to fine tune the algorithms) */

/* The following values are for hardware AEC and studio quality
 * microphone */

/* maximum NLMS filter length in taps. A longer filter length gives
 * better Echo Cancellation, but slower convergence speed and
 * needs more CPU power (Order of NLMS is linear) */
#define NLMS_LEN (80*8)

/* convergence speed. Range: >0 to <1 (0.2 to 0.7). Larger values give
 * more AEC in lower frequencies, but less AEC in higher frequencies. */
const REAL Stepsize = 0.7f;

/* minimum energy in xf. Range: M70dB_PCM to M50dB_PCM. Should be equal
 * to microphone ambient Noise level */
const REAL Min_xf = M75dB_PCM;

/* Double Talk Detector Speaker/Microphone Threshold. Range <=1
 * Large value (M0dB) is good for Single-Talk Echo cancellation,
 * small value (M12dB) is good for Double-Talk AEC */
const REAL GeigelThreshold = M6dB;

/* Double Talk Detector hangover in taps.
   Not relevant for Single-Talk
 * AEC */
const int Thold = 30 * 8;

/* for Non Linear Processor. Range >0 to 1. Large value (M0dB) is good
 * for Double-Talk, small value (M12dB) is good for Single-Talk */
const REAL NLPAttenuation = M6dB;

/* Below this line there are no more design constants */

/* Exponential Smoothing or IIR Infinite Impulse Response Filter */
class IIR_HP {
  REAL x;

public:
  IIR_HP() {
    x = 0.0f;
  };
  REAL highpass(REAL in) {
    const REAL a0 = 0.01f;      /* controls Transfer Frequency */
    /* Highpass = Signal - Lowpass. Lowpass = Exponential Smoothing */
    x += a0 * (in - x);
    return in - x;
  };
};

/* 13 taps FIR Finite Impulse Response filter
 * Coefficients calculated with
 * www.dsptutor.freeuk.com/KaiserFilterDesign/KaiserFilterDesign.html
 */
class FIR_HP13 {
  REAL z[14];

public:
  FIR_HP13() {
    memset(this, 0, sizeof(FIR_HP13));
  };
  REAL highpass(REAL in) {
    const REAL a[14] = {
      // Kaiser Window FIR Filter, Filter type: High pass
      // Passband: 300.0 - 4000.0 Hz, Order: 12
      // Transition band: 100.0 Hz, Stopband attenuation: 10.0 dB
      -0.043183226f, -0.046636667f, -0.049576525f, -0.051936015f,
      -0.053661242f, -0.054712527f, 0.82598513f, -0.054712527f,
      -0.053661242f, -0.051936015f, -0.049576525f, -0.046636667f,
      -0.043183226f, 0.0f
    };
    memmove(z+1, z, 13*sizeof(REAL));
    z[0] = in;
    REAL sum0 = 0.0, sum1 = 0.0;
    int j;

    for (j = 0; j < 14; j+= 2) {
      // optimize: partial loop unrolling
      sum0 += a[j] * z[j];
      sum1 += a[j+1] * z[j+1];
    }
    return sum0+sum1;
  }
};

/* Recursive single pole IIR Infinite Impulse response filter
 * Coefficients calculated with
 * http://www.dsptutor.freeuk.com/IIRFilterDesign/IIRFiltDes102.html
 */
class IIR1 {
  REAL x, y;

public:
  IIR1() {
    memset(this, 0, sizeof(IIR1));
  };
  REAL highpass(REAL in) {
    // Chebyshev IIR filter, Filter type: HP
    // Passband: 3700 - 4000.0 Hz
    // Passband ripple: 1.5 dB, Order: 1
    const REAL a0 = 0.105831884f;
    const REAL a1 = -0.105831884;
    const REAL b1 = 0.78833646f;
    REAL out = a0 * in + a1 * x + b1 * y;
    x = in;
    y = out;
    return out;
  }
};

/*
 * Recursive two pole IIR Infinite Impulse Response filter
 * Coefficients calculated with
 * http://www.dsptutor.freeuk.com/IIRFilterDesign/IIRFiltDes102.html
 */
class IIR2 {
  REAL x[2], y[2];

public:
  IIR2() { memset(this, 0, sizeof(IIR2)); };
  REAL highpass(REAL in) {
    // Butterworth IIR filter, Filter type: HP
    // Passband: 2000 - 4000.0 Hz, Order: 2
    const REAL a[] = { 0.29289323f, -0.58578646f, 0.29289323f };
    const REAL b[] = { 1.3007072E-16f, 0.17157288f };
    REAL out =
      a[0] * in + a[1] * x[0] + a[2] * x[1] - b[0] * y[0] - b[1] * y[1];

    x[1] = x[0];
    x[0] = in;
    y[1] = y[0];
    y[0] = out;
    return out;
  }
};

// Extension in taps to reduce mem copies
#define NLMS_EXT  (10*8)

// block size in taps to optimize DTD calculation
#define DTD_LEN   16

class AEC {
  // Time domain Filters
  IIR_HP hp00, hp1;             // DC-level removal highpass
  FIR_HP13 hp0;                 // 300Hz cut-off Highpass
  IIR1 Fx, Fe;                  // pre-whitening Highpass for x, e

  // Geigel DTD (Double Talk Detector)
  REAL max_max_x;               // max(|x[0]|, .. |x[L-1]|)
  int hangover;
  // optimize: less calculations for max()
  REAL max_x[NLMS_LEN / DTD_LEN];
  int dtdCnt;
  int dtdNdx;

  // NLMS-pw
  REAL x[NLMS_LEN + NLMS_EXT];  // tap delayed loudspeaker signal
  REAL xf[NLMS_LEN + NLMS_EXT]; // pre-whitening tap delayed signal
  REAL w[NLMS_LEN];             // tap weights
  int j;                        // optimize: less memory copies
  int lastupdate;               // optimize: iterative dotp(x,x)
  double dotp_xf_xf;            // double to avoid loss of precision
  REAL s0avg;

public:
  AEC();

  /* Geigel Double-Talk Detector
   *
   * in d: microphone sample (PCM as floating point value)
   * in x: loudspeaker sample (PCM as floating point value)
   * return: 0 for no talking, 1 for talking
   */
  int dtd(REAL d, REAL x);

  /* Normalized Least Mean Square Algorithm pre-whitening (NLMS-pw)
   * The LMS algorithm was developed by Bernard Widrow
   * book: Widrow/Stearns, Adaptive Signal Processing, Prentice-Hall, 1985
   *
   * in mic: microphone sample (PCM as floating point value)
   * in spk: loudspeaker sample (PCM as floating point value)
   * in update: 0 for convolve only, 1
   *            for convolve and update
   * return: echo cancelled microphone sample
   */
  REAL nlms_pw(REAL mic, REAL spk, int update);

  /* Acoustic Echo Cancellation and Suppression of one sample
   * in d: microphone signal with echo
   * in x: loudspeaker signal
   * return: echo cancelled microphone signal
   */
  int doAEC(int d, int x);

  float getambient() { return s0avg; };
  void setambient(float Min_xf) {
    dotp_xf_xf = NLMS_LEN * Min_xf * Min_xf;
  };
};

#define _AEC_H
#endif

/****************** APPENDIX aec.cpp ******************/

/* aec.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Acoustic Echo Cancellation NLMS-pw algorithm
 *
 * Version 1.3 filter created with www.dsptutor.freeuk.com
 */

#include <string.h>             /* memset, memmove */
#include <math.h>               /* fabsf, roundf */

#include "aec.h"

/* Vector Dot Product */
REAL dotp(REAL a[], REAL b[])
{
  REAL sum0 = 0.0, sum1 = 0.0;
  int j;

  for (j = 0; j < NLMS_LEN; j+= 2) {
    // optimize: partial loop unrolling
    sum0 += a[j] * b[j];
    sum1 += a[j+1] * b[j+1];
  }
  return sum0+sum1;
}

AEC::AEC()
{
  max_max_x = 0.0f;
  hangover = 0;
  memset(max_x, 0, sizeof(max_x));
  dtdCnt = dtdNdx = 0;

  memset(x, 0, sizeof(x));
  memset(xf, 0, sizeof(xf));
  memset(w, 0, sizeof(w));
  j = NLMS_EXT;
  lastupdate = 0;
  s0avg = M80dB_PCM;
  setambient(Min_xf);
}

REAL AEC::nlms_pw(REAL mic, REAL spk, int update)
{
  REAL d = mic;                 // desired signal
  x[j] = spk;
  xf[j] = Fx.highpass(spk);     // pre-whitening of x

  // calculate error value
  // (mic signal - estimated mic signal from spk signal)
  REAL e = d - dotp(w, x + j);
  REAL ef = Fe.highpass(e);     // pre-whitening of e

  // optimize: iterative dotp(xf, xf)
  dotp_xf_xf += (xf[j]*xf[j] - xf[j+NLMS_LEN-1]*xf[j+NLMS_LEN-1]);

  if (update) {
    // calculate variable step size
    REAL mikro_ef = Stepsize * ef / dotp_xf_xf;

    // update tap weights (filter learning)
    int i;
    for (i = 0; i < NLMS_LEN; i += 2) {
      // optimize: partial loop unrolling
      w[i] += mikro_ef*xf[i+j];
      w[i+1] += mikro_ef*xf[i+j+1];
    }
  }

  if (--j < 0) {
    // optimize: decrease number of memory copies
    j = NLMS_EXT;
    memmove(x+j+1, x, (NLMS_LEN-1)*sizeof(REAL));
    memmove(xf+j+1, xf, (NLMS_LEN-1)*sizeof(REAL));
  }

  return e;
}

int AEC::dtd(REAL d, REAL x)
{
  // optimized implementation of max(|x[0]|, |x[1]|, .., |x[L-1]|):
  // calculate max of block (DTD_LEN values)
  x = fabsf(x);
  if (x > max_x[dtdNdx]) {
    max_x[dtdNdx] = x;
    if (x > max_max_x) {
      max_max_x = x;
    }
  }
  if (++dtdCnt >= DTD_LEN) {
    dtdCnt = 0;
    // calculate max of max
    max_max_x = 0.0f;
    for (int i = 0; i < NLMS_LEN/DTD_LEN; ++i) {
      if (max_x[i] > max_max_x) {
        max_max_x = max_x[i];
      }
    }
    // rotate Ndx
    if (++dtdNdx >= NLMS_LEN/DTD_LEN) dtdNdx = 0;
    max_x[dtdNdx] = 0.0f;
  }

  // The Geigel DTD algorithm with Hangover timer Thold
  if (fabsf(d) >= GeigelThreshold * max_max_x) {
    hangover = Thold;
  }

  if (hangover) --hangover;

  return (hangover > 0);
}

int AEC::doAEC(int d, int x)
{
  REAL s0 = (REAL)d;
  REAL s1 = (REAL)x;

  // Mic Highpass Filter - to remove DC
  s0 = hp00.highpass(s0);

  // Mic Highpass Filter - telephone users are used to 300Hz cut-off
  s0 = hp0.highpass(s0);

  // ambient mic level estimation
  s0avg += 1e-4f*(fabsf(s0) - s0avg);

  // Spk Highpass Filter - to remove DC
  s1 = hp1.highpass(s1);

  // Double Talk Detector
  int update = !dtd(s0, s1);

  // Acoustic Echo Cancellation
  s0 = nlms_pw(s0, s1, update);

  // Acoustic Echo Suppression
  if (update) {
    // Non Linear Processor (NLP): attenuate low volumes
    s0 *= NLPAttenuation;
  }

  // Saturation
  if (s0 > MAXPCM) {
    return (int)MAXPCM;
  } else if (s0 < -MAXPCM) {
    return (int)-MAXPCM;
  } else {
    return (int)roundf(s0);
  }
}

/****************** APPENDIX cirbuf.h ******************/

/* cirbuf.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Circular Buffers
 *
 * Version 1.0
 */

#ifndef _CIRBUF_H

// must be multiple of FRAGSIZE and FRAMESIZE
#define CIRBUFSIZE  (2*80*8*2)

/* circular buffer for FRAGSIZE to FRAMESIZE conversion with
 * overrun/underrun */
class CIRBUF {
  char buf[CIRBUFSIZE];         // must be multiple of FRAGSIZE and FRAMESIZE
  int in;
  int out;
  int len;

public:
  CIRBUF();
  void init();
  int push(char *from, int size);
  int pop(char *to, int size);
  int getlen() { return len; }
};

#define _CIRBUF_H
#endif

/****************** APPENDIX cirbuf.cpp ******************/

/* cirbuf.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Circular buffers
 *
 * Version 1.0
 */

#include <string.h>             /* memcpy */
#include <strings.h>            /* bzero */

#include "oss.h"
#include "cirbuf.h"
#include "intercomd.h"

CIRBUF::CIRBUF()
{
  bzero(buf, CIRBUFSIZE);
  in = out = len = 0;
}

void CIRBUF::init()
{
  bzero(buf, CIRBUFSIZE);
  in = out = len = 0;
}

int CIRBUF::push(char *from, int size)
{
  memcpy(buf + in, from, size);
  in += size;
  if (in >= CIRBUFSIZE) {
    in -= CIRBUFSIZE;
  }
  len += size;
  if (len > CIRBUFSIZE) {
    int oversize = (((len - CIRBUFSIZE) / FRAGSIZE)) * FRAGSIZE;
    if (oversize < len - CIRBUFSIZE) {
      oversize += FRAGSIZE;
    }
    // delete oldest if overrun
    out += oversize;
    if (out >= CIRBUFSIZE) {
      out -= CIRBUFSIZE;
    }
    len -= oversize;
    return -oversize;
  } else {
    return OKAY;
  }
}

int CIRBUF::pop(char *to, int size)
{
  if (len < size) {
    // play out silence if underrun
    bzero(to, size);
    return ERROR;
  }
  memcpy(to, buf + out, size);
  out += size;
  if (out >= CIRBUFSIZE) {
    out -= CIRBUFSIZE;
  }
  len -= size;
  return OKAY;
}

/****************** APPENDIX oss.h ******************/

/* oss.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Open Sound System
 *
 * Version 1.0
 */

#ifndef _OSS_H

/* Design Constants */
#define AUDIOBUFS 2             // Soundcard buffers (minimum 2)
#define FRAMES 4                // 1 to 4 (20ms to 80ms)
/* End of Design Constants */

#define FORMAT_OSS AFMT_S16_LE

/* Using fragment sizes shorter than 256 bytes is not recommended as the
 * default mode of application */
#if FRAMES==4
#define FRAGSIZELD 6            // 8
#elif FRAMES==2
#define FRAGSIZELD 7
#else
#define FRAGSIZELD 6
#endif
#define FRAGSIZE (1<<FRAGSIZELD)

#include <stdio.h>              /* fprintf */

/* Error handling */
#include <errno.h>              /* errno */

/* low level io */
#include <fcntl.h>              /* open, O_RDWR */
#include <sys/ioctl.h>          /* ioctl */

/* OSS, see www.4front-tech.com/pguide/index.html */
#include <sys/soundcard.h>

#include "oss.h"
#include "intercomd.h"

int audio_init(char *pathname, int channels_)
{
  /* Using full duplex is simple in theory. The application just:
   *   Opens the device.
   *   Turns on full duplex
   *   Sets fragment size if necessary
   *   Sets number of channels, sample format and sampling rate
   *   Starts reading and writing the device
   */

  fprintf(stderr, "OSS Header SOUND_VERSION = %x\n", SOUND_VERSION);

  int audio_fd = open(pathname, O_RDWR);
  assert_errno(audio_fd >= 0);

  int sound_version = 0;
  ioctl(audio_fd, OSS_GETVERSION, &sound_version);
  // print the version reported by the driver, not the header constant
  fprintf(stderr, "OSS Driver SOUND_VERSION = %x\n", sound_version);

  ioctl(audio_fd, SNDCTL_DSP_SETDUPLEX, 0);

  /* The 16 most significant bits (MMMM) determine maximum number of
   * fragments. By default the driver computes this based on available
   * buffer space.
   * The minimum value is 2 and the maximum depends on the situation.
   * Set MMMM=0x7fff if you don't want to limit the number of fragments
   */
  // fragsize=2^FRAGSIZELD bytes
  int frag = AUDIOBUFS << 16 | FRAGSIZELD;
  if (2 == channels_) {
    ++frag;                     // double FRAGSIZE in stereo mode
  }
  int frag_ = frag;
  ioctl(audio_fd, SNDCTL_DSP_SETFRAGMENT, &frag);
  fprintf(stderr, "SETFRAGMENT=0x%x\n", frag);
  assert_errno(frag_ == frag);

  int format = FORMAT_OSS;
  ioctl(audio_fd, SNDCTL_DSP_SETFMT, &format);
  assert_errno(format == FORMAT_OSS);

  int channels = channels_;
  ioctl(audio_fd, SNDCTL_DSP_CHANNELS, &channels);
  assert_errno(channels_ == channels);
  fprintf(stderr, "SNDCTL_DSP_CHANNELS=%d\n", channels);

  int rate = 8000;
  ioctl(audio_fd, SNDCTL_DSP_SPEED, &rate);
  assert_errno(8000 == rate);
  fprintf(stderr, "SNDCTL_DSP_SPEED=%d\n", rate);

  fprintf(stderr, "\n");
  return audio_fd;
}

/****************** APPENDIX rtp.h ******************/

/* rtp.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Real-time Transport Protocol Version 2 (RFC3550)
 *
 * Version 1.0
 */

class RTP {
  /* Format in Host Byte order. Conversion with htonl() to Network Byte
   * order. This data structure is implementation dependent!
   * Tested with GCC and x86 */
  unsigned long sequence:16;
  unsigned long payload_type:7;
  unsigned long marker:1;
  unsigned long csrc_count:4;
  unsigned long extension:1;
  unsigned long padding:1;
  unsigned long version:2;
  unsigned long timestamp;
  unsigned long ssrc;

public:
  RTP();
  void init(int payload_type_);
  void next(int frameduration);
  int check(int payload_type);
  unsigned long getssrc() { return ssrc; };
};

const unsigned PT_PCMU = 0;     // 8000 sample/second, G.711 u-Law
const unsigned PT_iLBC = 96;    // unofficial RTP payload type

char *RTP_network_copy(char *to, RTP * from);
char *RTP_host_copy(RTP * to, char *from);

/****************** APPENDIX tcp.h ******************/

/* tcp.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * TCP server functions for IPv4
 *
 * Version 1.0
 */

int tcp_server_init(int port);
int tcp_server_init2(int listen_fd);

/****************** APPENDIX tcp.cpp ******************/

/* tcp.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * TCP server functions for IPv4
 *
 * Version 1.0
 */

#include <stdio.h>              /* fprintf */
#include <stdlib.h>             /* exit */
#include <string.h>             /* memset */

/* Socket io */
#include <sys/types.h>
#include <sys/socket.h>         /* socket, bind, listen, accept */
#include <netinet/in.h>         /* struct sockaddr_in, INADDR_ANY */
#include <arpa/inet.h>          /* htonl, htons */
#include <fcntl.h>              /* fcntl, O_NONBLOCK */

/* error handling */
#include <errno.h>

#include "intercomd.h"

int tcp_server_init(int port)
/* open the server (listen) port - do this one time */
{
  int fd = socket(PF_INET, SOCK_STREAM, 0);
  assert_errno(fd >= 0);

  struct sockaddr_in sock;
  memset((char *) &sock, 0, sizeof(sock));
  sock.sin_family = AF_INET;
  sock.sin_addr.s_addr = htonl(INADDR_ANY);
  sock.sin_port = htons(port);

  if (bind(fd, (struct sockaddr *) &sock, sizeof(sock)) < 0) {
    fprintf(stderr, "tcp_server_init(): bind() failed\n");
    exit(2);
  }

  int ret = listen(fd, 1);
  assert_errno(ret >= 0);

  return fd;
}

int tcp_server_init2(int listen_fd)
/* open the communication (connection) - do this for every client */
{
  // fprintf(stderr, "tcp_server_init2\n");

  // avoid blocking accept()
  int flags = fcntl(listen_fd, F_GETFL);
  int ret = fcntl(listen_fd, F_SETFL, flags | O_NONBLOCK);
  assert_errno(ret >= 0);

  struct sockaddr_in sock;
  socklen_t socklen = sizeof(sock);
  int fd = accept(listen_fd, (struct sockaddr *) &sock, &socklen);
  assert_errno(fd >= 0);

  return fd;
}

/****************** APPENDIX udp.h ******************/

/* udp.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * UDP functions for IPv4
 *
 * Version 1.0
 */

class UDP {
  int local_fd;
  struct sockaddr_in foreign_sock;

public:
  void send_init(char *foreign_name, int foreign_port, int local_fd_);
  void send(char *buf, int bytes);
  void send_close();
};

int UDP_recv_init(int port);

// IP to ASCII
char *iptoa(char *buf, in_addr_t ip);

// ASCII to IP
in_addr_t atoip(char *buf);

/****************** APPENDIX udp.cpp ******************/

/* udp.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * UDP functions for IPv4
 * Multicast Doc: /usr/share/doc/howto/en/html/Multicast-HOWTO.html
 *
 * Version 1.0
 */

#include <stdio.h>              /* fprintf, sprintf, sscanf */
#include <stdlib.h>             /* exit */
#include <string.h>             /* memset, memcpy */

/* Socket io */
#include <sys/types.h>
#include <sys/socket.h>         /* socket, bind, getsockopt, sendto */
#include <netinet/in.h>         /* struct sockaddr_in, in_addr_t */
#include <arpa/inet.h>          /* htonl, htons, ntohl */
#include <netdb.h>              /* gethostbyname */

#include "udp.h"
#include "intercomd.h"

int UDP_recv_init(int port)
/* Start UDP reception on the LAN. The program exits on error! */
/* return: file descriptor for receiving */
/* in port: UDP port number */
{
  int fd;
  struct sockaddr_in sock;
  int local_flag;
  socklen_t local_flagsize;
  int reuse_flag, reuse_len;

  /* init UDP local1 */
  fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    fprintf(stderr, "UDP_recv_init(): socket() failed\n");
    exit(1);
  }

  memset((char *) &sock, 0, sizeof(sock));
  sock.sin_family = AF_INET;
  sock.sin_addr.s_addr = htonl(INADDR_ANY);
  sock.sin_port = htons(port);

  if (bind(fd, (struct sockaddr *) &sock, sizeof(sock)) < 0) {
    fprintf(stderr, "UDP_recv_init(): bind() failed\n");
    exit(2);
  }

  local_flagsize = sizeof(int);
  if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, (char *) &local_flag,
                 &local_flagsize) < 0) {
    fprintf(stderr, "UDP_recv_init(): getsockopt() failed\n");
    exit(3);
  }
  /* printf("SO_RCVBUF=%d\n", local_flag); */
  /* printf("init socket\n"); */

  return fd;
}

void UDP::send_init(char *foreign_name, int foreign_port, int local_fd_)
/* Start sending UDP messages on the LAN */
/* in foreign_name: IP address to send to */
/* in foreign_port: UDP port to send to */
{
  struct sockaddr_in local_sock;
  struct hostent
      *foreign_host;
  int local_flag;
  socklen_t local_flagsize;

  local_fd = local_fd_;

  /* turn on broadcast */
  local_flagsize = sizeof(int);
  if (getsockopt(local_fd, SOL_SOCKET, SO_BROADCAST,
                 (char *) &local_flag, &local_flagsize) < 0) {
    fprintf(stderr, "udp_send_init(): getsockopt() failed\n");
    exit(3);
  }
  if (local_flag == 0) {
    local_flag = 1;
    setsockopt(local_fd, SOL_SOCKET, SO_BROADCAST,
               (char *) &local_flag, sizeof(int));
    local_flagsize = sizeof(int);
    if (getsockopt(local_fd, SOL_SOCKET, SO_BROADCAST,
                   (char *) &local_flag, &local_flagsize) < 0) {
      fprintf(stderr, "udp_send_init() SO_BROADCAST failed\n");
      exit(3);
    }
  }

  /* init foreign part */
  memset((char *) &foreign_sock, 0, sizeof(foreign_sock));
  foreign_host = gethostbyname(foreign_name);
  if (foreign_host == NULL || foreign_host->h_length == 0) {
    fprintf(stderr, "udp_send_init(): gethostbyname() failed\n");
    exit(1);
  }
  memcpy(&foreign_sock.sin_addr.s_addr, foreign_host->h_addr_list[0],
         foreign_host->h_length);
  foreign_sock.sin_family = AF_INET;
  foreign_sock.sin_port = htons(foreign_port);

  in_addr_t to_ip = ntohl(foreign_sock.sin_addr.s_addr);
  char s[20];
  print_gui("c %s\n", iptoa(s, to_ip));
}

void UDP::send(char *buf, int bytes)
/* Send a UDP message on the LAN */
/* in buf: message */
/* in bytes: length of the message */
{
  int len;

  len = sendto(local_fd, buf, bytes, 0,
               (struct sockaddr *) &foreign_sock, sizeof(foreign_sock));
  if (len != bytes) {
    fprintf(stderr, "udp_send(): sendto() foreign failed ret=%d\n", len);
  }
}

void UDP::send_close()
{
  memset((char *) &foreign_sock, 0, sizeof(foreign_sock));
}

// IP to ASCII
char *iptoa(char *buf, in_addr_t ip)
{
  int i1 = ip >> 24;
  int i2 = (ip >> 16) & 0xFF;
  int i3 = (ip >> 8) & 0xFF;
  int i4 = ip & 0xFF;
  sprintf(buf, "%d.%d.%d.%d", i1, i2, i3, i4);
  return buf;
}

// ASCII to IP
in_addr_t atoip(char *buf)
{
  int i1, i2, i3, i4;
  sscanf(buf, "%d.%d.%d.%d", &i1, &i2, &i3, &i4);
  return i1 << 24 | i2 << 16 | i3 << 8 | i4;
}

/****************** APPENDIX intercom.tcl ******************/
#!/usr/bin/wish

# intercom.tcl
#
# Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
#
# Voice-over-IP Intercom Graphical User Interface
#
# Version 1.1 with short/long keypress

proc t_color {t state} {
  switch $state {
  0 { .$t configure -foreground black          ;# nothing
      .$t configure -activeforeground black
      .$t configure -background "#d9d9d9"
      .$t configure -activebackground "#d9d9d9"}
  1 { .$t configure -foreground black          ;# transmit
      .$t configure -activeforeground black
      .$t configure -background yellow
      .$t configure -activebackground yellow}
  2 { .$t configure -foreground black          ;# receive
      .$t configure -activeforeground black
      .$t configure -background magenta
      .$t configure -activebackground magenta}
  3 { .$t configure -foreground black          ;# full duplex
      .$t configure -activeforeground black
      .$t configure -background green
      .$t configure -activebackground green}
  }
}

proc keyPress {t} {
  global tmap state sock mode tping

  set mode($t) 0
  set ip $tmap($t)
  switch $state($t) {
  0 {set cmd c
     after 300 [list set mode($t) 1]}
  1 {set cmd h}
  2 {set cmd c
     after 300 [list set mode($t) 1]}
  3 {set cmd h}
  }
  # puts "$cmd $ip"
  puts $sock "$cmd $ip"
  flush $sock
}

proc keyRelease {t} {
  global mode

  if {$mode($t)} {
    keyPress $t
  }
}

proc tx_begin {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "tx_begin $ip $t"
  switch $state($t) {
  0 {set state($t) 1}
  1 { }
  2 {set state($t) 3}
  3 { }
  }
  t_color $t $state($t)
}

proc rx_begin {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "rx_begin $ip $t"
  switch $state($t) {
  0 {set state($t) 2}
  1 {set state($t) 3}
  2 { }
  3 { }
  }
  t_color $t $state($t)
}

proc tx_end {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "tx_end $ip $t"
  switch $state($t) {
  0 {}
  1 {set state($t) 0}
  2 { }
  3 {set state($t) 2}
  }
  t_color $t $state($t)
}

proc rx_end {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "rx_end $ip $t"
  switch $state($t) {
  0 { }
  1 { }
  2 {set state($t) 0}
  3 {set state($t) 1}
  }
  t_color $t $state($t)
}

proc recv {} {
  global sock

  gets $sock cmd
  # puts $cmd
  set argv [split $cmd]
  # puts $argv
  set ip [lindex $argv 1]
  switch [lindex $argv 0] {
  c {tx_begin $ip}
  r {rx_begin $ip}
  h {tx_end $ip}
  d {rx_end $ip}
  }
}

# include GUI
source intercom.ui.tcl
intercom_ui .

proc guiconfig {t text ip} {
  global state mode ipmap tmap

  set state($t) 0
  set mode($t) 0
  set ipmap($ip) $t
  set tmap($t) $ip
  .$t configure -text $text -highlightthickness 12
  .$t configure -command [list keyRelease $t]
  bind .$t [list keyPress $t]
}

# include configuration
source intercom.config.tcl

# init TCP connection to intercomd
set sock [socket 127.0.0.1 4999]
fileevent $sock readable recv

set nodename [exec uname -n]
wm title . "intercom $nodename"

/****************** APPENDIX intercom.ui.tcl ******************/

#! /bin/sh
# the next line restarts using wish \
exec wish "$0" "$@"

# interface generated by SpecTcl version 1.1 from /home/anblf/voipconf/intercom.ui
#   root     is the parent window for this user interface

proc intercom_ui {root args} {

  # this treats "." as a special case

  if {$root == "."} {
    set base ""
  } else {
    set base $root
  }

  canvas $base.canvas#1 \
    -height 0 \
    -width 400

  canvas $base.canvas#2 \
    -height 160 \
    -width 0

  button $base.t1 \
    -text t1

  button $base.t2 \
    -text t2

  button $base.t3 \
    -text t3

  button $base.t4 \
    -text t4

  button $base.t5 \
    -text t5

  button $base.t6 \
    -text t6

  button $base.t7 \
    -text t7

  button $base.t8 \
    -text t8

  # Geometry management

  grid $base.canvas#1 -in $root -row 1 -column 2 \
    -columnspan 4 \
    -sticky nesw
  grid $base.canvas#2 -in $root -row 2 -column 1 \
    -rowspan 2 \
    -sticky nesw
  grid $base.t1 -in $root -row 2 -column 2
  grid $base.t2 -in $root -row 2 -column 3
  grid $base.t3 -in $root -row 2 -column 4
  grid $base.t4 -in $root -row 2 -column 5
  grid $base.t5 -in $root -row 3 -column 2
  grid $base.t6 -in $root -row 3 -column 3
  grid $base.t7 -in $root -row 3 -column 4
  grid $base.t8 -in $root -row 3 -column 5

  # Resize behavior management

  grid rowconfigure $root 1 -weight 0 -minsize 2
  grid rowconfigure $root 2 -weight 0 -minsize 30
  grid rowconfigure $root 3 -weight 0 -minsize 30
  grid columnconfigure $root 1 -weight 0 -minsize 2
  grid columnconfigure $root 2 -weight 0 -minsize 30
  grid columnconfigure $root 3 -weight 0 -minsize 30
  grid columnconfigure $root 4 -weight 0 -minsize 30
  grid columnconfigure $root 5 -weight 0 -minsize 30
# additional interface code
# end additional interface code

}

# Allow interface to be run "stand-alone" for testing

catch {
  if [info exists embed_args] {
    # we are running in the plugin
    intercom_ui .
  } else {
    # we are running in stand-alone mode
    if {$argv0 == [info script]} {
      wm title . "Testing intercom_ui"
      intercom_ui .
    }
  }
}
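
The overrun and underrun handling of the circular buffer in the cirbuf.cpp appendix may be easier to follow with a small stand-alone sketch. The code below is not part of the memo's source: the class name DemoCirBuf and the constants kBufSize and kChunk are hypothetical stand-ins for CIRBUF, CIRBUFSIZE and FRAGSIZE, and the sizes are chosen small for illustration only. As in the memo, it assumes every push and pop size is a multiple of the chunk size and that the chunk size divides the buffer size, so a copy never wraps around the end of the buffer.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical sizes for illustration; the memo takes its real values
// from cirbuf.h (CIRBUFSIZE) and oss.h (FRAGSIZE).
const int kBufSize = 16;        // stand-in for CIRBUFSIZE
const int kChunk = 4;           // stand-in for FRAGSIZE, must divide kBufSize

class DemoCirBuf {
  char buf[kBufSize];
  int in, out, len;

public:
  DemoCirBuf() : in(0), out(0), len(0) { std::memset(buf, 0, sizeof(buf)); }

  // Returns 0 on success, or -(number of bytes dropped) when the oldest
  // data had to be discarded because of an overrun.
  int push(const char *from, int size) {
    std::memcpy(buf + in, from, size);
    in = (in + size) % kBufSize;
    len += size;
    if (len > kBufSize) {
      // round the excess up to whole chunks and drop the oldest data
      int over = len - kBufSize;
      int dropped = ((over + kChunk - 1) / kChunk) * kChunk;
      out = (out + dropped) % kBufSize;
      len -= dropped;
      return -dropped;
    }
    return 0;
  }

  // Returns 0 on success, -1 on underrun (the output is filled with
  // zeros, which is silence for signed 16 bit PCM).
  int pop(char *to, int size) {
    if (len < size) {
      std::memset(to, 0, size);
      return -1;
    }
    std::memcpy(to, buf + out, size);
    out = (out + size) % kBufSize;
    len -= size;
    return 0;
  }

  int getlen() const { return len; }
};
```

With these sizes, popping from an empty buffer yields silence and an error code, four pushes of one chunk fill the buffer, and a fifth push overwrites the oldest chunk and reports the number of bytes dropped as a negative value.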