Draft                                                   Andre Adrian
   Document: draft-conference-01.txt         DFS Deutsche Flugsicherung
   Category: Experimental       
   November 23rd, 2004
   Expires: ?
                                
                                
             Voice over Internet Unicast telephone conference
                               
  
Status of this Memo

  This document specifies a telephone conference implementation for
  hands-free Voice over Internet telephony and requests discussion and
  suggestions for improvements.
  Distribution of this memo is unlimited.
  
  
Copyright Notice
   
  Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
  
  You are allowed to use this source code in any open source or closed 
  source software you want. You are allowed to use the algorithms for a
  hardware solution. You are allowed to modify the source code.
  You are not allowed to remove the name of the author from this memo or
  from the source code files. You are not allowed to monopolize the
  source code or the algorithms behind the source code as your 
  intellectual property. This source code is free of royalty and comes
  with no warranty.
  
  
Abstract

  This memo describes a VoIP telephone conference algorithm based on a
  built-in Multi Conferencing Unit with unicast messages. The one-way
  transmission time is evaluated. An implementation in C++ for the Linux
  operating system is discussed.


Introduction

  The RTP protocol (RFC 3550) describes two ways to run a telephone
  conference. The first is to use multicast UDP messages. The second is
  to use unicast UDP from every node to every other node (3
  participants need 3 unicast links, 4 participants need 6 unicast
  links, l = n*(n-1)/2).
  This memo describes a third way to run a telephone conference with
  unicast messages. For 3 participants we need 2 unicast links, for 4
  participants we need 3 links, that is l = n-1. This solution is
  called "built-in Multi Conferencing Unit with unicast messages" in
  the ITU-T H.323 documentation.
  There is a price to pay. The approach has 2 disadvantages: First, the
  audio mixing is done in one node only. If this node quits the
  telephone conference, the telephone conference is terminated. Second,
  speaking from one leaf node to another leaf node needs 2 audio
  compression/decompression passes.
  Both disadvantages are typical for MCU (multi conferencing unit)
  solutions.
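
  The two link-count formulas can be compared with a few lines of C++.
  This is a minimal sketch; the function names full_mesh_links and
  star_links are chosen for illustration and do not appear in the
  appendix source code.

```cpp
// Number of unicast links for n conference participants.

// Every node talks to every other node: l = n*(n-1)/2.
constexpr int full_mesh_links(int n)
{
  return n * (n - 1) / 2;
}

// One central node mixes for all leaf nodes: l = n-1.
constexpr int star_links(int n)
{
  return n - 1;
}
```

  For 5 participants the full mesh needs 10 links, while the layout
  with one central node needs 4.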
  
  
Audio mixing

  The core of every telephone conference is the audio mixing. The audio
  mixer is a device with N audio inputs and N audio outputs. Normally
  the signal on output i is the sum of all input signals except the
  signal from input i.
  In Fig. 1 every node has a 3-input/3-output audio mixer. The
  connections between A and B and between B and C are the unicast links
  mentioned above. Node B is the central node, nodes A and C are leaf
  nodes. If user A speaks into the microphone, audio mixer A forwards
  this audio signal to output 2. Output 2 of mixer A is connected to
  input 3 of mixer B. Output 2 of mixer B is the sum of input 1 (user B
  talking) and input 3 (user A talking). This output 2 of mixer B is
  connected to input 2 of mixer C. Output 1 of mixer C drives the
  speaker, so user C hears user A talking.
                            
                            +-----------+
                            |out3    in3|
                            |           |
                +--------<--|out2 A  in2|--<--------+
                |           |           |           |
                |   Spk--<--|out1    in1|--<--Mic   |
                |           +-----------+           |
                |                                   |
                |           +-----------+           |
                +-------->--|in3    out3|-->--------+
                            |           |           
                +-------->--|in2  B out2|-->--------+
                |           |           |           |
                |   Mic-->--|in1    out1|-->--Spk   |
                |           +-----------+           |
                |                                   |
                |           +-----------+           |
                |           |out3    in3|           |
                |           |           |           |
                +--------<--|out2 C  in2|--<--------+
                            |           |
                    Spk--<--|out1    in1|--<--Mic
                            +-----------+
                           
             Figure 1: Central Node B with Leaf Nodes A and C
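
  The mixing rule (output i is the sum of all inputs except input i) can
  be sketched for one sample instant as follows. The function names
  clip16 and mix_minus and the use of std::vector are assumptions of
  this sketch; the clip limits of +/-32767 follow the appendix source
  code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Clip a wide sum to the symmetric 16 bit sample range.
static int16_t clip16(long sum)
{
  if (sum > 32767) return 32767;
  if (sum < -32767) return -32767;
  return static_cast<int16_t>(sum);
}

// Takes one sample per mixer input and returns one sample per output.
// out[i] is the sum of all in[k] with k != i.
std::vector<int16_t> mix_minus(const std::vector<int16_t>& in)
{
  long total = 0;
  for (int16_t s : in) total += s;

  std::vector<int16_t> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = clip16(total - in[i]);   // leave out the own input
  }
  return out;
}
```

  With the input samples 100, 200 and 300 the output samples are 500,
  400 and 300.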
             

Audio mixing of audio frames (packets)

  In a Voice-over-IP system we have audio frames. These audio frames
  contain audio samples for 20 milliseconds or so and have to arrive
  every 20 milliseconds to produce a constant audio feed to the speaker.
  The audio mixer above is a pure sample-by-sample mixer and does not
  know about audio frames.
  The handling of audio frames is done with circular buffers. These
  buffers are located at the < and > positions in Fig. 1. We also need
  the buffers because frames have different sizes. The audio frames of
  the OSS (Open Sound System) are limited to powers of 2, like 4ms, 8ms,
  16ms. The audio frames on the network contain 20ms, 30ms or 80ms of
  speech as defined in the codec documentation or in RFC3551.
  To have minimum latency the software uses 4ms OSS frames for 20ms
  network frames. For 40ms network frames it is possible to use 8ms OSS
  frames, for 80ms network frames we can use 16ms OSS frames.
  The creation of frames is driven by the microphone. The OSS system
  will create a software interrupt for every microphone frame.
  Note: Because the reason for this software interrupt is a hardware
  interrupt, the jitter of this interrupt is less than the jitter of an
  operating system timer interrupt - at least for Linux.
  With a current microphone audio frame the audio mixer can perform the
  mixing. The other network inputs are used as available - there is no
  waiting. The mixer creates an output of microphone frame size. This
  output is added to the output circular buffers. If an individual
  output buffer has enough data to fill a network frame, the codec
  compression is done and a network packet is sent.
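
  The step from small sound card fragments to network frames can be
  sketched as follows. FrameAssembler is a hypothetical helper written
  for this illustration, it is not the CIRBUF class of the appendix.
  The sketch assumes 8 kHz sampling with 16 bit samples, as implied by
  the FRAMESIZE definition (20*8*2) in intercomd.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int SAMPLE_RATE = 8000;                               // Hz
constexpr std::size_t FRAG_SAMPLES = SAMPLE_RATE * 4 / 1000;    // 4ms
constexpr std::size_t FRAME_SAMPLES = SAMPLE_RATE * 20 / 1000;  // 20ms

class FrameAssembler {
public:
  // Append one sound card fragment. Returns how many complete
  // network frames became available (and were consumed) by this push.
  std::size_t push(const std::vector<int16_t>& frag)
  {
    fifo_.insert(fifo_.end(), frag.begin(), frag.end());
    std::size_t frames = 0;
    while (fifo_.size() >= FRAME_SAMPLES) {
      // A real sender would encode and transmit these samples here.
      fifo_.erase(fifo_.begin(),
        fifo_.begin() + static_cast<std::ptrdiff_t>(FRAME_SAMPLES));
      ++frames;
    }
    return frames;
  }

  // Samples that are waiting for the next network frame.
  std::size_t pending() const { return fifo_.size(); }

private:
  std::vector<int16_t> fifo_;
};
```

  With these sizes every fifth 4ms fragment completes one 20ms network
  frame; the four pushes before it only fill the buffer.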


Latency of built-in Multi-Conferencing with unicast messages

  A 20 millisecond network audio frame gives a one-way transmission
  time of at least 20 milliseconds. RTT (round trip time) divided by 2
  is commonly accepted as a good enough estimate for the transport
  delay.
  This section investigates the impact of circular buffers on the
  real one-way transmission time.
  First we look at a normal telephone call. The microphone buffer needs
  4ms (or 8ms, 16ms). The codec needs 20ms. Together we have 24ms in the
  transmitting node. The receiving node uses a speaker buffer that is
  equal to the microphone buffer.
  Together we have 2 times the audio buffer, the codec frame time and
  the transport time in the network. This calculation is quite
  different from the ITU-T calculation in G.114. For G.114 the audio
  buffer time is equal to the codec frame time.
  Now we look at a telephone conference. The one-way transmission times
  from and to the central node are equal to the normal telephone call
  case.
  The leaf-node to leaf-node one-way transmission time is worse. Because
  the central node has to read a network audio frame, mix the contents
  and write another network audio frame, the calculation is 2 times the
  audio buffer, 2 times the codec frame time and 2 times the transport
  time in the network.
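
  The two sums from this section can be written down as a small
  calculation. This is a sketch only; the struct LinkTimes and both
  function names are made up for the illustration, and the transport
  time of 10ms used in the example is an assumed value, not a
  measurement.

```cpp
// All times in milliseconds.
struct LinkTimes {
  int audio_buffer;   // microphone or speaker buffer, e.g. 4
  int codec_frame;    // network frame duration, e.g. 20
  int transport;      // network transport delay, estimated as RTT/2
};

// Normal call, or leaf node from/to central node:
// 2 times audio buffer + codec frame time + transport time.
constexpr int direct_latency_ms(const LinkTimes& t)
{
  return 2 * t.audio_buffer + t.codec_frame + t.transport;
}

// Leaf node to leaf node through the central node:
// 2 times audio buffer + 2 times codec frame + 2 times transport.
constexpr int leaf_to_leaf_latency_ms(const LinkTimes& t)
{
  return 2 * t.audio_buffer + 2 * t.codec_frame + 2 * t.transport;
}
```

  With a 4ms audio buffer, a 20ms codec frame and an assumed 10ms
  transport time, direct_latency_ms gives 38 and
  leaf_to_leaf_latency_ms gives 68.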
  
  
Latency of Intercom conference call

  To limit the one-way transmission time, the "everybody talks/listens
  with everybody else" approach of a telephone conference call can be
  reduced to an intercom conference call. In an intercom conference call
  the leaf-nodes do not talk/listen to each other, there is only
  talk/listen from and to the central node. If users are satisfied with
  this limited conference call, the one-way transmission time is again
  the number we calculated for a normal telephone call.
  

Software design

  The application is split into three processes: the application start
  script intercom, the audio and network process intercomd, written in
  C++, and the graphical user interface intercom.tcl, written in Tcl/Tk.
  Note: The libraries for graphical user interfaces often make it
  impossible to use the select() system call. Therefore the application
  was split into a realtime process intercomd and a GUI process
  intercom.tcl.
  The start script intercom sets up the hardware audio mixer for
  hardware acoustic echo cancellation. Then it starts the intercomd
  process and gives this process realtime priority. Finally, the GUI
  process is started.
  The audio network packets use the iLBC codec to save network
  bandwidth. The software sends a network packet every 80ms. Each
  network packet contains 4 iLBC frames, each 20ms long. This is not
  good for packet loss handling, but good for network bandwidth - with
  all headers (IPv4, UDP, RTP) we have 21kBit/s for one audio link.
  With a network packet every 20ms the bandwidth is 37kBit/s. The iLBC
  codec itself needs 15.2kBit/s.
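
  The bandwidth figures can be checked with a short calculation. This
  is a sketch under stated assumptions: a 20ms iLBC frame has 38 bytes,
  the IPv4, UDP and RTP headers have 20, 8 and 12 bytes, and the
  function name link_bit_rate is made up for the illustration. The
  14 byte Ethernet header is an assumption of this sketch; the memo does
  not say which link layer was counted.

```cpp
// Sizes in bytes.
constexpr long ILBC_20MS_BYTES = 38;            // one 20ms iLBC frame
constexpr long IP_UDP_RTP_BYTES = 20 + 8 + 12;  // IPv4 + UDP + RTP
constexpr long ETHERNET_BYTES = 14;  // assumption, not from the memo

// Bit rate in bit/s of one audio link that puts frames_per_packet
// iLBC frames of 20ms into every packet and adds header_bytes of
// protocol headers to every packet.
constexpr long link_bit_rate(long frames_per_packet, long header_bytes)
{
  return (header_bytes + frames_per_packet * ILBC_20MS_BYTES) * 8 * 1000
    / (frames_per_packet * 20);
}
```

  With IP_UDP_RTP_BYTES alone the result is 19200 bit/s for 4 frames
  per packet and 31200 bit/s for 1 frame per packet. Adding the assumed
  ETHERNET_BYTES gives 20600 bit/s and 36800 bit/s, which round to the
  21kBit/s and 37kBit/s quoted above. This suggests, but does not prove,
  that the quoted figures include a link layer header.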


Interprocess communication

  The audio and network process intercomd listens as a TCP server on
  port 4999. The GUI process intercom.tcl operates as a TCP client. Both
  processes normally run on the same computer and can communicate via
  localhost (IP address 127.0.0.1). It is possible to run both
  processes on different computers.
  To transport audio network packets the Real-time Transport Protocol
  (RFC3550) is used with IPv4 unicast UDP network packets.
  
  
Program options
  
  The application understands the following options:
  
  -a Value
      set ambient background noise to the given decibel value. For
      optimum acoustic echo cancellation the background noise level
      should be given. The application measures the noise level every 5
      seconds if there is a connection to another intercom and writes
      an "Ambient =" message to standard error.
     
  -l
      use hardware AEC and Line-in connector for the microphone. This
      option was tested with Sennheiser microphone capsule ME34,
      Sennheiser gooseneck MZH3040 and Behringer microphone amplifier
      MIC100.
      
  -m
      use hardware AEC and Mic-in connector for microphone. This option
      was tested with Labtec Mic 333.
      
  -p Portnumber
      set RTP Portnumber. Default is 5000. All audio communication is
      done on this port number. Source and destination port number are
      equal.
      
  -t
      do telephone conference call. Everybody can listen/talk to
      everybody else. Default is intercom conference call - see above.

  
Program configuration

  The program assumes that there is only one intercom user for every IP
  address. The GUI process handles the Name-to-IP translation with a
  fixed table. The Name is the text on the direct access button.
  See the intercom.tcl source code for details of the - very primitive -
  translation. You will have to change the intercom.tcl file for your IP
  addresses!
  
  
Appendix

  The following appendices contain the source code. The source code was
  compiled and tested on some IA32 computers (Intel Celeron, Intel
  Centrino, AMD Athlon, AMD Athlon XP). The operating system was Linux
  Kernel 2.6 (SuSE distribution 9.1) with GCC 3.3.3, Tcl/Tk 8.4, OSS
  3.8.2 and ALSA 1.0.3.
  In addition to the compile tools, the RPM packages rtstools and alsa
  are needed. You also need the iLBC codec sources from
  http://www.ietf.org/internet-drafts/draft-ietf-avt-ilbc-codec-05.txt


Compile

  cd ilbc
  make
  cd ..
  c++ -O2 -o intercomd aec.cpp cirbuf.cpp oss.cpp rtp.cpp tcp.cpp \
  udp.cpp intercomd.cpp ilbc/ilbc.a -lm
  
  
Run

  ./intercom
  
  Note: Read the shell script intercom to get an idea of what is going
  on. For a first test talk to yourself by clicking the button for your
  own IP address, e.g. EDDF TEC2 if your computer is 192.168.1.2.
  The button should become green.
  Attention: A short click (less than 300ms) is a toggle switch, a long
  click is a push-to-talk.


/****************** APPENDIX intercom ******************/

#!/bin/bash

# intercom
#
# Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
#
# Voice-over-IP Intercom start script for Linux with ALSA
#
# Version 1.0

echo "usage: intercom [OPTIONS] [Partner1-IP [Partner2-IP ...]]"
echo " -a value    set ambient (background) noise to value decibel"
echo " -l          use hardware AEC and Line-in for microphone"
echo " -m          use hardware AEC and Mic-in for microphone"
echo " -p Number   RTP Portnumber (default is 5000)"
echo " -t          Telephone conference call (everybody with everybody)"
echo ""

# delete old process
killall -9 intercomd 2>/dev/null

# configure audio mixer (AC97 compatible)
# check your mixer hardware with:
# 	cat /dev/sndstat
#
# hardware AEC test successful with on-board sound mixers:
# Analog Devices AD1985, ICEnsemble ICE1232, Realtek ALC650 and ALC655, 
# SigmaTel STAC9750/51
#
# hardware AEC test successful with PCI sound cards:
# Soundblaster PCI128
#
# hardware AEC test failed with PCI sound cards:
# Soundblaster Audigy 2, C-Media 8738

# set playback volume
amixer -q sset 'PCM',0 70%
amixer -q sset 'Master',0 70%

# use only PCM for playback
amixer -q set 'Master',0 unmute
amixer -q set 'PCM',0 unmute
amixer -q set 'Mic',0 mute
amixer -q set 'Line',0 mute
amixer -q set 'CD',0 mute
amixer -q set 'Aux',0 mute

# enable recording
amixer -q cset iface=MIXER,name='Capture Switch' 1

# handle options
for argv in "$@"; do
  # echo $argv
  case $argv in
  ("-l")
    # for Hardware AEC and Line-In Capture
    amixer -q set 'Capture',0 0%-,100%-
    amixer -q cset iface=MIXER,name='Capture Source' 4,5
  ;;
  ("-m")
    # for Hardware AEC and Mic-In Capture
    amixer -q set 'Capture',0 0%-,100%-
    amixer -q cset iface=MIXER,name='Capture Source' 0,5
  ;;
  esac
done


# start audio/network daemon
./intercomd "$@" &
sleep 1

# give audio/network daemon realtime process prio
sudo /usr/sbin/setpriority `pidof intercomd` fifo 1

# To make sudo work without password add as superuser with the program 
# visudo:
# %users ALL=(root) NOPASSWD: /usr/sbin/setpriority

# start graphical user interface
./intercom.tcl "$@" &


/****************** APPENDIX intercomd.h ******************/

/* intercomd.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Voice over IP Intercom with Telephone conference and Acoustic Echo
 * Cancellation using unicast RTP messages (RFC3550)
 *
 * Version 1.0
 */
#ifndef _INTERCOMD_H

#define ERROR (-1)
#define OKAY 0

#define NO 0
#define YES 1

/* Emit program info and abort the program if expr is false with errno.
 * Wrapped in do/while(0) so the macro is safe inside if/else. */
#define assert_errno(expr) \
do { \
  if(!(expr)) { \
    fprintf(stderr, "voipconf: %s:%d: %s: Assertion '%s' failed. errno=%s\n", \
    __FILE__, __LINE__, __PRETTY_FUNCTION__, __STRING(expr), strerror(errno)); \
    exit(1); \
  } \
} while (0)

/* Emit program info and return function if expr is true with retvalue */
#define return_if(expr, retvalue) \
do { \
  if(expr) { \
    fprintf(stderr, "voipconf: %s:%d: %s: Check '%s' failed.\n", \
    __FILE__, __LINE__, __PRETTY_FUNCTION__, __STRING(expr)); \
    return(retvalue); \
  } \
} while (0)

int print_gui(const char *fmt, ...);

#define _INTERCOMD_H
#endif


/****************** APPENDIX intercomd.cpp ******************/

/* intercomd.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Voice over IP Intercom with Telephone conference and Acoustic Echo
 * Cancellation using unicast RTP messages (RFC3550)
 *
 * Attention. This source code is not very portable! You need:
 *  iLBC Codec Sourcecode draft-ietf-avt-ilbc-codec-05.txt
 *  same endian for CPU and soundcard for 16bit audio sample
 *  Open Source Sound (OSS) support
 *  ALSA Sound support for hardware (2-channel) AEC
 * 
 * Compile Sourcecode: 
c++ -O2 -o intercomd aec.cpp cirbuf.cpp oss.cpp rtp.cpp tcp.cpp udp.cpp \
intercomd.cpp ilbc/ilbc.a -lm
 *
 * Format Sourcecode:
indent -kr -i2 -nlp -ci2 -l72 -lc72 -nut intercomd.cpp
 *
 * To be done:
 * Sometimes click noise with telephone conference after adding a third
 * node
 * Packet loss concealment handling
 * Better Jitter buffer handling
 * open/close audio io on demand
 *
 * Version 1.0
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <stdarg.h>

/* Error handling */
#include <assert.h>
#include <errno.h>

/* low level io */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>

/* Socket io */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* iLBC codec */
#include "ilbc/iLBC_define.h"
#include "ilbc/iLBC_encode.h"
#include "ilbc/iLBC_decode.h"

#include "rtp.h"
#include "udp.h"
#include "tcp.h"
#include "aec.h"
#include "oss.h"
#include "cirbuf.h"
#include "intercomd.h"

/* Design Constants */
#define PARTNERS    5           /* maximum telephony partners */
/* End of Design Constants */

#define FORMAT_RTP  PT_iLBC
#define ILBC_MODE   20
#define ILBCSIZE    NO_OF_BYTES_20MS
#define FRAMESIZE   (20*8*2)    /* compression frame size */
#define MTU	    1460        /* Maximum Transfer Unit */

#define PACKETDURATION	(FRAMES*20*8)


/* audio */
static CIRBUF mic_cirbuf;
static CIRBUF spk_cirbuf;
static AEC aec;
static int channels = 1;

/* network transmitting to partners */
static in_addr_t to_ip[PARTNERS];
static RTP to_rtp[PARTNERS];
static UDP to_udp[PARTNERS];
static iLBC_Enc_Inst_t Enc_Inst[PARTNERS];
static CIRBUF conf_cirbuf[PARTNERS];
static char tx_buf[PARTNERS][MTU];
static char *tx_pbuf[PARTNERS];
static int tx_frames[PARTNERS];
static int to_partners = 0;
static int telephone_conference = 0;

/* network receiving from partners */
static in_addr_t from_ip[PARTNERS];
static unsigned long from_ssrc[PARTNERS];
static int from_cnt[PARTNERS];
static CIRBUF from_cirbuf[PARTNERS];
static iLBC_Dec_Inst_t Dec_Inst[PARTNERS];
static int from_partners = 0;


/*----------------------------------------------------------------*
*  Encoder interface function 
*---------------------------------------------------------------*/

short encode(                   /* (o) Number of bytes encoded */
  iLBC_Enc_Inst_t * iLBCenc_inst,       /* (i/o) Encoder instance */
  short *encoded_data,          /* (o) The encoded bytes */
  short *data                   /* (i) The signal block to encode */
  )
{
  float block[BLOCKL_MAX];
  int k;


  /* convert signal to float */

  for (k = 0; k < iLBCenc_inst->blockl; k++)
    block[k] = (float) data[k];

  /* do the actual encoding */

  iLBC_encode((unsigned char *) encoded_data, block, iLBCenc_inst);


  return (iLBCenc_inst->no_of_bytes);
}

int tx_buf_init(int i)
{
  /* reset the transmit buffer of partner i to empty */
  tx_pbuf[i] = tx_buf[i];
  tx_frames[i] = 0;
  return OKAY;
}

int audio_read(int audio_fd)
{
  /* fat software interrupt routine: read audio, send UDP packets */

  short mic_buf[FRAGSIZE / 2];
  if (1 == channels) {
    size_t len = read(audio_fd, mic_buf, FRAGSIZE);
    return_if(len != FRAGSIZE, ERROR);

    if (0 == to_partners) {
      /* start assembling send packets only if we have a target */
      return OKAY;
    }

    short spk_buf[FRAGSIZE / 2];
    spk_cirbuf.pop((char *) spk_buf, FRAGSIZE);

    /* Acoustic Echo Cancellation - using software buffers */
    int i;
    for (i = 0; i < FRAGSIZE / 2; ++i) {
      mic_buf[i] = aec.doAEC(mic_buf[i], spk_buf[i]);
    }
  } else {
    short mic2_buf[FRAGSIZE];
    size_t len = read(audio_fd, mic2_buf, 2 * FRAGSIZE);
    return_if(len != 2 * FRAGSIZE, ERROR);

    if (0 == to_partners) {
      /* start assembling send packets only if we have a target */
      return OKAY;
    }

    /* Acoustic Echo Cancellation - using hardware audio mixer */
    int i;
    for (i = 0; i < FRAGSIZE / 2; ++i) {
      mic_buf[i] = aec.doAEC(mic2_buf[2 * i], mic2_buf[2 * i + 1]);
    }
  }

  int ret = mic_cirbuf.push((char *) mic_buf, FRAGSIZE);
  if (ret < 0) {
    fprintf(stderr, "mic_cirbuf.push overrun\n");
  }

  if (mic_cirbuf.getlen() >= FRAMESIZE) {
    /* My RFC3551 interpretation: Only one RTP Header for packets 
     * with a number of frames */
    int i;
    for (i = 0; i < PARTNERS; ++i) {
      if (to_ip[i] && 0 == tx_frames[i]) {
        /* put RTP header */
        tx_pbuf[i] = RTP_network_copy(tx_pbuf[i], &to_rtp[i]);
      }
    }

    /* put payload (audio) */
    short micbuf[FRAMESIZE / 2];
    mic_cirbuf.pop((char *) micbuf, FRAMESIZE);

    if (telephone_conference) {
      /* telephone conference mix - everybody from/to everybody else */
      short from_buf[PARTNERS][FRAMESIZE / sizeof(short)];
      short sum_buf[FRAMESIZE / sizeof(short)];
      int i, j, k;

      /* get audio from other partners */
      for (i = 0; i < PARTNERS; ++i) {
        conf_cirbuf[i].pop((char *) from_buf[i], FRAMESIZE);
      }

      for (i = 0; i < PARTNERS; ++i) {
        if (to_ip[i]) {
          for (j = 0; j < FRAMESIZE / sizeof(short); ++j) {
            /* mix */
            long sum = micbuf[j];
            for (k = 0; k < PARTNERS; ++k) {
              if (to_ip[i] != from_ip[k]) {   /* do not mix in origin */
                sum += from_buf[k][j];
              }
            }
            /* clip */
            if (sum > 32767) {
              sum_buf[j] = 32767;
            } else if (sum < -32767) {
              sum_buf[j] = -32767;
            } else {
              sum_buf[j] = sum;
            }
          }
          /* do encoding (audio compression) */
          short encoded_data[ILBCSIZE / 2];
          int len = encode(&Enc_Inst[i], encoded_data, sum_buf);

          /* distribute to transmit buffers */
          memcpy(tx_pbuf[i], encoded_data, len);
          tx_pbuf[i] += len;
          assert(tx_pbuf[i] - tx_buf[i] <= MTU);
        }
      }
    } else {
      /* intercom conference mixing - central node from/to other nodes */
      /* do encoding (audio compression) */
      short encoded_data[ILBCSIZE / 2];
      int len = encode(&Enc_Inst[0], encoded_data, micbuf);

      /* distribute to transmit buffers */
      int i;
      for (i = 0; i < PARTNERS; ++i) {
        if (to_ip[i]) {
          memcpy(tx_pbuf[i], encoded_data, len);
          tx_pbuf[i] += len;
          assert(tx_pbuf[i] - tx_buf[i] <= MTU);
        }
      }
    }

    /* transmit data packet(s) */
    for (i = 0; i < PARTNERS; ++i) {
      if (to_ip[i] && ++tx_frames[i] >= FRAMES) {
        to_udp[i].send(tx_buf[i], tx_pbuf[i] - tx_buf[i]);

        /* prepare next go */
        tx_buf_init(i);
        to_rtp[i].next(PACKETDURATION);
      }
    }
  }
  return OKAY;
}

int partner_timeout()
{
  /* Delete old from_ssrc[] entries - this is not very quick! */
  int i;
  for (i = 0; i < PARTNERS; ++i) {
    if (from_ssrc[i] && from_cnt[i] == 0) {
      char s[20];
      print_gui("d %s\n", iptoa(s, from_ip[i]));
      from_ssrc[i] = 0;
      from_ip[i] = 0;
      --from_partners;
    }
    from_cnt[i] = 0;
  }
  return OKAY;
}

int partner_lookup(unsigned long ssrc, in_addr_t ip)
{
  /* search */
  int i;
  for (i = 0; i < PARTNERS; ++i) {
    if (from_ssrc[i] == ssrc) {
      ++from_cnt[i];
      return i;                 /* old entry */
    }
  }
  /* add new entry */
  for (i = 0; i < PARTNERS; ++i) {
    if (0 == from_ssrc[i]) {
      if (0 == from_partners) {
        spk_cirbuf.init();
      }
      from_ssrc[i] = ssrc;
      from_ip[i] = ip;
      from_cnt[i] = 1;
      initDecode(Dec_Inst + i, ILBC_MODE, 1);
      conf_cirbuf[i].init();
      ++from_partners;
      char s[20];
      print_gui("r %s\n", iptoa(s, ip));
      return i;
    }
  }
  return ERROR;
}

void audio_write(int audio_fd)
{
  int i, j;
  short from_buf[PARTNERS][FRAGSIZE / sizeof(short)];
  short sum_buf[FRAGSIZE / sizeof(short)];

  /* get audio */
  for (i = 0; i < PARTNERS; ++i) {
    from_cirbuf[i].pop((char *) from_buf[i], FRAGSIZE);
  }

  for (j = 0; j < FRAGSIZE / sizeof(short); ++j) {
    /* mix */
    long sum = 0;
    for (i = 0; i < PARTNERS; ++i) {
      sum += from_buf[i][j];
    }
    /* clip */
    if (sum > 32767) {
      sum_buf[j] = 32767;
    } else if (sum < -32767) {
      sum_buf[j] = -32767;
    } else {
      sum_buf[j] = sum;
    }
  }

  if (1 == channels) {
    if (from_partners > 0) {
      /* save for 1-channel AEC */
      int ret = spk_cirbuf.push((char *) sum_buf, FRAGSIZE);
      if (ret < 0) {
        /* fprintf(stderr, "spk_cirbuf.push overrun\n"); */
      }
    }
    write(audio_fd, sum_buf, FRAGSIZE);
  } else {
    short sum2_buf[FRAGSIZE];
    int i;
    for (i = 0; i < FRAGSIZE / 2; ++i) {
      sum2_buf[2 * i] = 0;      /* left channel silence */
      sum2_buf[2 * i + 1] = sum_buf[i]; /* right channel spk */
    }
    write(audio_fd, sum2_buf, 2 * FRAGSIZE);
  }
}

/*----------------------------------------------------------------*
*  Decoder interface function 
*---------------------------------------------------------------*/

short decode(                   /* (o) Number of decoded samples */
  iLBC_Dec_Inst_t * iLBCdec_inst,       /* (i/o) Decoder instance */
  short *decoded_data,          /* (o) Decoded signal block */
  short *encoded_data,          /* (i) Encoded bytes */
  short mode                    /* (i) 0=PL, 1=Normal */
  )
{
  int k;
  float decblock[BLOCKL_MAX], dtmp;

  /* check if mode is valid */

  if (mode < 0 || mode > 1) {
    printf("\nERROR - Wrong mode - 0, 1 allowed\n");
    exit(3);
  }

  /* do actual decoding of block */
  iLBC_decode(decblock, (unsigned char *) encoded_data,
    iLBCdec_inst, mode);

  /* convert to short */

  for (k = 0; k < iLBCdec_inst->blockl; k++) {
    dtmp = decblock[k];

    if (dtmp < MIN_SAMPLE)
      dtmp = MIN_SAMPLE;
    else if (dtmp > MAX_SAMPLE)
      dtmp = MAX_SAMPLE;
    decoded_data[k] = (short) dtmp;
  }

  return (iLBCdec_inst->blockl);
}

static unsigned short rtp_port = 5000;

int udp_read(int udp_fd)
{
  /* software interrupt routine */
  char buf[MTU];
  struct sockaddr_in from_sock;
  socklen_t from_socklen = sizeof(sockaddr_in);
  RTP rtp_in;

  int len = recvfrom(udp_fd, buf, MTU, 0,
    (struct sockaddr *) &from_sock, &from_socklen);
  return_if(sizeof(RTP) + FRAMES * ILBCSIZE != len, ERROR);

  /* check Port number */
  in_addr_t from_ip = ntohl(from_sock.sin_addr.s_addr);
  in_port_t from_port = ntohs(from_sock.sin_port);
  return_if(from_port != rtp_port, ERROR);

  char *pbuf = RTP_host_copy(&rtp_in, buf);
  int rc = rtp_in.check(FORMAT_RTP);
  return_if(rc, ERROR);

  int partner = partner_lookup(rtp_in.getssrc(), from_ip);
  return_if(partner < 0, ERROR);

  int i;
  for (i = 0; i < FRAMES; ++i, pbuf += ILBCSIZE) {
    /* do decoding (audio decompression) */
    /* tbd.: Packet loss concealment */
    short decoded_data[FRAMESIZE / 2];
    len = decode(Dec_Inst + partner, decoded_data, (short *) pbuf, 1);
    return_if(len != FRAMESIZE / 2, ERROR);

    int ret = from_cirbuf[partner].push((char *) decoded_data, 2 * len);
    if (ret < 0) {
      fprintf(stderr, "from_cirbuf[%d].push overrun=%d\n", partner,
        ret / FRAGSIZE);
    }
    if (telephone_conference) {
      ret = conf_cirbuf[partner].push((char *) decoded_data, 2 * len);
      if (ret < 0) {
        /* fprintf(stderr, "conf_cirbuf[%d].push overrun=%d\n", 
           partner, ret/FRAGSIZE); */
      }
    }
  }
  return OKAY;
}

void command(char *cmd, int udp_fd)
{
  /* delete special characters like \r, \n */
  int i;
  for (i = 0; i < strlen(cmd); ++i) {
    if (cmd[i] < ' ') {         /* hack: assume ASCII coding */
      cmd[i] = 0;
      break;
    }
  }
  in_addr_t ip;
  switch (cmd[0]) {
  default:
    printf("voipconf commands:\n"
      "c IP-Address          - connect to IP-Address\n"
      "h IP-Address          - hang-up IP-Address\n\n");
    fflush(stdout);
    break;
  case 'c':
    ip = atoip(cmd + 2);
    for (i = 0; i < PARTNERS; ++i) {
      if (0 == to_ip[i]) {
        if (0 == to_partners) {
          mic_cirbuf.init();
        }
        tx_buf_init(i);
        to_ip[i] = ip;
        to_rtp[i].init(FORMAT_RTP);
        to_udp[i].send_init(cmd + 2, rtp_port, udp_fd);
        ++to_partners;
        break;
      }
    }
    break;
  case 'h':
    ip = atoip(cmd + 2);
    for (i = 0; i < PARTNERS; ++i) {
      if (ip == to_ip[i]) {
        to_ip[i] = 0;
        to_udp[i].send_close();
        --to_partners;

        print_gui("%s\n", cmd); /* Tcl/Tk needs \n */
        break;
      }
    }
    break;
  }

  /* fprintf(stderr, "cmd=%s to_partners=%d\n", cmd, to_partners); */
}

#define CMDLEN	80

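int gui_read(int gui_fd, int udp_fd)
{
  char cmd[CMDLEN];

  /* zero the buffer and leave room for the terminating 0, because
   * command() treats cmd as a C string */
  memset(cmd, 0, sizeof(cmd));
  int len = read(gui_fd, cmd, CMDLEN - 1);

  if (len <= 0) {
    fprintf(stderr, "gui_read() close\n");
    int ret = shutdown(gui_fd, SHUT_RDWR);
    assert_errno(ret >= 0);
    close(gui_fd);              /* release the descriptor */

    return -1;
  }

  command(cmd, udp_fd);

  return gui_fd;
}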

static int gui_fd = -1;

int print_gui(const char *fmt, ...)
{
/* in fmt: Formatstring as printf  */
/* in ...: Parameter(s) as printf  */

  if (gui_fd >= 0) {
    char s[MTU];
    va_list ap;
    va_start(ap, fmt);
    (void) vsnprintf(s, MTU, fmt, ap);
    va_end(ap);

    int len = strlen(s);

    return write(gui_fd, s, len);
  } else {
    return ERROR;
  }
}

struct timeval difftimeval(struct timeval time1, struct timeval time2)
{
  struct timeval diff;

  diff.tv_usec = time1.tv_usec - time2.tv_usec;
  if (diff.tv_usec < 0) {
    diff.tv_usec += 1000000;
    time2.tv_sec += 1;          /* borrow one second */
  }
  diff.tv_sec = time1.tv_sec - time2.tv_sec;

  return diff;
}

float dB2q(float dB)
{
  /* Decibel to Ratio */
  return powf(10.0f, dB / 20.0f);
}

float q2dB(float q)
{
  /* Ratio to Decibel */
  return 20.0f * log10f(q);
}

/* program main loop. OS Event handler */
int loop(int audio_fd, int udp_fd, int gui_listen_fd)
{

  struct timeval timeout;
  fd_set read_fds;
  int max_fd = 64;              /* should be max(fd, ..) + 1 */
  static struct timeval last_partner_timeout;
  static struct timeval last_getambient;
  struct timezone tz;

  gettimeofday(&last_partner_timeout, &tz);
  gettimeofday(&last_getambient, &tz);
  for (;;) {
    timeout.tv_sec = 0;
    timeout.tv_usec = FRAGTIME * 1000;
    FD_ZERO(&read_fds);
    FD_SET(audio_fd, &read_fds);
    FD_SET(udp_fd, &read_fds);
    FD_SET(gui_listen_fd, &read_fds);
    if (gui_fd >= 0) {
      FD_SET(gui_fd, &read_fds);
    }
    int ret = select(max_fd, &read_fds, NULL, NULL, &timeout);
    assert_errno(ret >= 0);

    if (FD_ISSET(audio_fd, &read_fds)) {
      /* audio_write(audio_fd); */
      audio_read(audio_fd);
    }
    if (FD_ISSET(udp_fd, &read_fds)) {
      udp_read(udp_fd);
    }
    if (FD_ISSET(gui_listen_fd, &read_fds)) {
      gui_fd = tcp_server_init2(gui_listen_fd);
    }
    if (gui_fd >= 0) {
      if (FD_ISSET(gui_fd, &read_fds)) {
        gui_fd = gui_read(gui_fd, udp_fd);
      }
    }

    /* because of problems with Intel ICH5/Analog Devices AD1985 */
    audio_write(audio_fd);

    struct timeval now, diff;
    gettimeofday(&now, &tz);
    diff = difftimeval(now, last_partner_timeout);
    /* 2*PACKETDURATION in usec; also fire if a full second passed */
    if (diff.tv_sec > 0 || diff.tv_usec >= 160000) {
      last_partner_timeout = now;
      partner_timeout();
    }
    gettimeofday(&now, &tz);
    diff = difftimeval(now, last_getambient);
    if (diff.tv_sec >= 5) {
      last_getambient = now;
      if (to_partners > 0) {
        float ambient = aec.getambient();
        float ambientdB = q2dB(ambient / 32767.0f);
        fprintf(stderr, "Ambient = %2.0f dB\n", ambientdB);
      }
    }
  }
  return ERROR;
}

int main(int argc, char *argv[])
{

  int i;
  for (i = 1; i < argc && '-' == argv[i][0]; ++i) {
    switch (argv[i][1]) {
    case 'a':         /* set Ambient (No Talking) Noise level */
      aec.setambient(MAXPCM*dB2q(atof(argv[++i])));
      break;
    case 'l':         /* use hardware AEC and Line-in for microphone */
      channels = 2;
      break;
    case 'm':         /* use hardware AEC and Mic-in for microphone */
      channels = 2;
      break;
    case 'p':         /* RTP Portnumber (default is 5000) */
      rtp_port = atoi(argv[++i]);
      rtp_port &= 0xFFFE; /* RFC3550: RTP port has even port number */
      break;
    case 't':         /* Telephone conference call (true conference) */
      telephone_conference = 1;
      break;
    }
  }
  /* open Audio Transmit and Receive */
  int audio_fd = audio_init("/dev/dsp", channels);
  assert(audio_fd >= 0);

  /* open Network Receive */
  int udp_fd = UDP_recv_init(rtp_port);
  assert(udp_fd >= 0);

  /* open Graphical User Interface as TCP server */
  int gui_listen_fd = tcp_server_init(4999);

  /* iLBC codec Initialization */
  int j;
  for (j = 0; j < PARTNERS; ++j) {
    initEncode(Enc_Inst + j, ILBC_MODE);
    initDecode(Dec_Inst + j, ILBC_MODE, 1);
  }

  /* open Network Transmit Partner (Connections) */
  for (; i < argc && to_partners < PARTNERS; ++i) {
    if (0 == to_partners) {
      mic_cirbuf.init();
    }
    tx_buf_init(to_partners);
    to_ip[to_partners] = atoip(argv[i]);
    to_rtp[to_partners].init(FORMAT_RTP);
    to_udp[to_partners].send_init(argv[i], rtp_port, udp_fd);
    ++to_partners;
  }

  loop(audio_fd, udp_fd, gui_listen_fd);

  return OKAY;
}
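
  The periodic checks in loop() depend on the time elapsed between two
  gettimeofday() readings. The following is a minimal sketch of such a
  check, assuming only the POSIX struct timeval. The names elapsed_usec
  and interval_elapsed are hypothetical and not part of intercomd.

```cpp
#include <cassert>
#include <sys/time.h>

// Hypothetical helper: microseconds from `earlier` to `later`.
inline long long elapsed_usec(const struct timeval &later,
                              const struct timeval &earlier)
{
  return (long long)(later.tv_sec - earlier.tv_sec) * 1000000LL
       + (long long)(later.tv_usec - earlier.tv_usec);
}

// True once at least interval_usec microseconds have passed since `last`.
inline bool interval_elapsed(const struct timeval &now,
                             const struct timeval &last,
                             long long interval_usec)
{
  assert(interval_usec >= 0);
  return elapsed_usec(now, last) >= interval_usec;
}
```

  Comparing the full microsecond count avoids looking at tv_usec alone,
  which wraps around every second.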


/****************** APPENDIX aec.h ******************/

/* aec.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Acoustic Echo Cancellation NLMS-pw algorithm
 *
 * Version 1.3 filter created with www.dsptutor.freeuk.com
 */

#ifndef _AEC_H                  /* include only once */

// use double if your CPU does software-emulation of float
typedef float REAL; 	

/* dB Values */
const REAL M0dB = 1.0f;
const REAL M3dB = 0.71f;
const REAL M6dB = 0.50f;
const REAL M9dB = 0.35f;
const REAL M12dB = 0.25f;
const REAL M18dB = 0.125f;
const REAL M24dB = 0.063f;

/* dB values for 16bit PCM */
/* MxdB_PCM = 32767 * 10^(-x / 20), i.e. x dB below full scale */
const REAL M10dB_PCM = 10362.0f;
const REAL M20dB_PCM = 3277.0f;
const REAL M25dB_PCM = 1843.0f;
const REAL M30dB_PCM = 1036.0f;
const REAL M35dB_PCM = 583.0f;
const REAL M40dB_PCM = 328.0f;
const REAL M45dB_PCM = 184.0f;
const REAL M50dB_PCM = 104.0f;
const REAL M55dB_PCM = 58.0f;
const REAL M60dB_PCM = 33.0f;
const REAL M65dB_PCM = 18.0f;
const REAL M70dB_PCM = 10.0f;
const REAL M75dB_PCM = 6.0f;
const REAL M80dB_PCM = 3.0f;
const REAL M85dB_PCM = 2.0f;
const REAL M90dB_PCM = 1.0f;

const REAL MAXPCM = 32767.0f;

/* Design constants (Change to fine tune the algorithms) */

/* The following values are for hardware AEC and studio quality 
 * microphone */

/* maximum NLMS filter length in taps. A longer filter gives better
 * echo cancellation, but converges more slowly and needs more CPU
 * power (the cost of NLMS grows linearly with the filter length) */
#define NLMS_LEN  (80*8)

/* convergence speed. Range: >0 to <1 (0.2 to 0.7). Larger values give
 * more AEC in lower frequencies, but less AEC in higher frequencies. */
const REAL Stepsize = 0.7f;

/* minimum energy in xf. Range: M70dB_PCM to M50dB_PCM. Should be equal
 * to microphone ambient Noise level */
const REAL Min_xf = M75dB_PCM;

/* Double Talk Detector Speaker/Microphone Threshold. Range <=1
 * Large value (M0dB) is good for Single-Talk Echo cancellation, 
 * small value (M12dB) is good for Double-Talk AEC */
const REAL GeigelThreshold = M6dB;

/* Double Talk Detector hangover in taps. Not relevant for Single-Talk 
 * AEC */
const int Thold = 30 * 8;

/* for Non Linear Processor. Range >0 to 1. Large value (M0dB) is good
 * for Double-Talk, small value (M12dB) is good for Single-Talk */
const REAL NLPAttenuation = M6dB;

/* Below this line there are no more design constants */


/* Exponential Smoothing or IIR Infinite Impulse Response Filter */
class IIR_HP {
  REAL x;

public:
   IIR_HP() { x = 0.0f; };
  REAL highpass(REAL in) {
    const REAL a0 = 0.01f;   /* controls Transfer Frequency */
    /* Highpass = Signal - Lowpass. Lowpass = Exponential Smoothing */
    x += a0 * (in - x);
    return in - x;
  };
};

/* 13 taps FIR Finite Impulse Response filter
 * Coefficients calculated with
 * www.dsptutor.freeuk.com/KaiserFilterDesign/KaiserFilterDesign.html
 */
class FIR_HP13 {
  REAL z[14];
  
public:
   FIR_HP13() { memset(this, 0, sizeof(FIR_HP13)); };
  REAL highpass(REAL in) {
    const REAL a[14] = {
      // Kaiser Window FIR Filter, Filter type: High pass
      // Passband: 300.0 - 4000.0 Hz, Order: 12
      // Transition band: 100.0 Hz, Stopband attenuation: 10.0 dB
      -0.043183226f, -0.046636667f, -0.049576525f, -0.051936015f, 
      -0.053661242f, -0.054712527f, 0.82598513f, -0.054712527f, 
      -0.053661242f, -0.051936015f, -0.049576525f, -0.046636667f, 
      -0.043183226f, 0.0f
    };
    memmove(z+1, z, 13*sizeof(REAL));
    z[0] = in;
    REAL sum0 = 0.0, sum1 = 0.0;
    int j;

    for (j = 0; j < 14; j+= 2) {
      // optimize: partial loop unrolling
      sum0 += a[j] * z[j];
      sum1 += a[j+1] * z[j+1];
    }
    return sum0+sum1;
  }
};

/* Recursive single pole IIR Infinite Impulse response filter
 * Coefficients calculated with
 * http://www.dsptutor.freeuk.com/IIRFilterDesign/IIRFiltDes102.html
 */
class IIR1 {
  REAL x, y;

public:
   IIR1() { memset(this, 0, sizeof(IIR1)); };
  REAL highpass(REAL in) {
    // Chebyshev IIR filter, Filter type: HP
    // Passband: 3700 - 4000.0 Hz
    // Passband ripple: 1.5 dB, Order: 1
    const REAL a0 = 0.105831884f;
    const REAL a1 = -0.105831884f;
    const REAL b1 = 0.78833646f;
    REAL out = a0 * in + a1 * x + b1 * y;
    x = in;
    y = out;
    return out;
  }
};

/* Recursive two pole IIR Infinite Impulse Response filter
 * Coefficients calculated with
 * http://www.dsptutor.freeuk.com/IIRFilterDesign/IIRFiltDes102.html
 */
class IIR2 {
  REAL x[2], y[2];

public:
   IIR2() { memset(this, 0, sizeof(IIR2)); };
  REAL highpass(REAL in) {
    // Butterworth IIR filter, Filter type: HP
    // Passband: 2000 - 4000.0 Hz, Order: 2
    const REAL a[] = { 0.29289323f, -0.58578646f, 0.29289323f };
    const REAL b[] = { 1.3007072E-16f, 0.17157288f };
    REAL out =
      a[0] * in +
      a[1] * x[0] +
      a[2] * x[1] -
      b[0] * y[0] -
      b[1] * y[1];

    x[1] = x[0];
    x[0] = in;
    y[1] = y[0];
    y[0] = out;
    return out;
  }
};

// Extension in taps to reduce mem copies
#define NLMS_EXT  (10*8)  

// block size in taps to optimize DTD calculation 
#define DTD_LEN   16       


class AEC {
  // Time domain Filters
  IIR_HP hp00, hp1;             // DC-level removal Highpass
  FIR_HP13 hp0;                 // 300Hz cut-off Highpass
  IIR1 Fx, Fe;                  // pre-whitening Highpass for x, e

  // Geigel DTD (Double Talk Detector)
  REAL max_max_x;               // max(|x[0]|, .. |x[L-1]|)
  int hangover;
  // optimize: less calculations for max()
  REAL max_x[NLMS_LEN / DTD_LEN];  
  int dtdCnt;
  int dtdNdx;

  // NLMS-pw
  REAL x[NLMS_LEN + NLMS_EXT];  // tap delayed loudspeaker signal
  REAL xf[NLMS_LEN + NLMS_EXT]; // pre-whitening tap delayed signal
  REAL w[NLMS_LEN];             // tap weights
  int j;                        // optimize: less memory copies
  int lastupdate;               // optimize: iterative dotp(x,x)
  double dotp_xf_xf;            // double to avoid loss of precision
  REAL s0avg;

public:
   AEC();

/* Geigel Double-Talk Detector
 *
 * in d: microphone sample (PCM as floating point value)
 * in x: loudspeaker sample (PCM as floating point value)
 * return: 0 for no talking, 1 for talking
 */
  int dtd(REAL d, REAL x);

/* Normalized Least Mean Square Algorithm pre-whitening (NLMS-pw)
 * The LMS algorithm was developed by Bernard Widrow
 * book: Widrow/Stearns, Adaptive Signal Processing, Prentice-Hall, 1985
 *
 * in mic: microphone sample (PCM as floating point value)
 * in spk: loudspeaker sample (PCM as floating point value)
 * in update: 0 for convolve only, 1 for convolve and update 
 * return: echo cancelled microphone sample
 */
  REAL nlms_pw(REAL mic, REAL spk, int update);

/* Acoustic Echo Cancellation and Suppression of one sample
 * in   d:  microphone signal with echo
 * in   x:  loudspeaker signal
 * return:  echo cancelled microphone signal
 */
  int doAEC(int d, int x);

  float getambient() {
    return s0avg;
  };
  void setambient(float Min_xf) {
    dotp_xf_xf = NLMS_LEN * Min_xf * Min_xf;
  };
};

#define _AEC_H
#endif
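
  The *_PCM constants in aec.h are amplitudes x dB below the 16 bit
  full scale value 32767, that is 32767 * 10^(-x/20). The following is
  a minimal sketch of that conversion and its inverse. The helper names
  db_to_pcm and pcm_to_db are hypothetical and not part of aec.h.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: amplitude that lies attenuation_dB below the
// 16 bit PCM full scale value 32767.
inline float db_to_pcm(float attenuation_dB)
{
  return 32767.0f * std::pow(10.0f, -attenuation_dB / 20.0f);
}

// Hypothetical inverse: attenuation in dB below full scale for a
// positive PCM amplitude.
inline float pcm_to_db(float amplitude)
{
  assert(amplitude > 0.0f);
  return -20.0f * std::log10(amplitude / 32767.0f);
}
```

  For example, db_to_pcm(20.0f) rounds to 3277 and db_to_pcm(40.0f)
  rounds to 328, the values listed for M20dB_PCM and M40dB_PCM.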


/****************** APPENDIX aec.cpp ******************/

/* aec.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Acoustic Echo Cancellation NLMS-pw algorithm
 *
 * Version 1.3 filter created with www.dsptutor.freeuk.com
 */

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
#include "aec.h"


/* Vector Dot Product */
REAL dotp(REAL a[], REAL b[]) {
  REAL sum0 = 0.0, sum1 = 0.0;
  int j;
  
  for (j = 0; j < NLMS_LEN; j+= 2) {
    // optimize: partial loop unrolling
    sum0 += a[j] * b[j];
    sum1 += a[j+1] * b[j+1];
  }
  return sum0+sum1;
}


AEC::AEC()
{
  max_max_x = 0.0f;
  hangover = 0;
  memset(max_x, 0, sizeof(max_x));
  dtdCnt = dtdNdx = 0;
  
  memset(x, 0, sizeof(x));
  memset(xf, 0, sizeof(xf));
  memset(w, 0, sizeof(w));
  j = NLMS_EXT;
  lastupdate = 0;
  s0avg = M80dB_PCM;
  setambient(Min_xf);
}

REAL AEC::nlms_pw(REAL mic, REAL spk, int update)
{
  REAL d = mic;      	        // desired signal
  x[j] = spk;
  xf[j] = Fx.highpass(spk);     // pre-whitening of x
  
  // calculate error value 
  // (mic signal - estimated mic signal from spk signal)
  REAL e = d - dotp(w, x + j);
  REAL ef = Fe.highpass(e);    // pre-whitening of e
  // optimize: iterative dotp(xf, xf)
  dotp_xf_xf += (xf[j]*xf[j] - xf[j+NLMS_LEN-1]*xf[j+NLMS_LEN-1]);
  if (update) {
    // calculate variable step size
    REAL mikro_ef = Stepsize * ef / dotp_xf_xf;

    // update tap weights (filter learning)
    int i;
    for (i = 0; i < NLMS_LEN; i += 2) {
      // optimize: partial loop unrolling
      w[i] += mikro_ef*xf[i+j];
      w[i+1] += mikro_ef*xf[i+j+1];
    }
  }

  if (--j < 0) {
    // optimize: decrease number of memory copies
    j = NLMS_EXT;
    memmove(x+j+1, x, (NLMS_LEN-1)*sizeof(REAL));    
    memmove(xf+j+1, xf, (NLMS_LEN-1)*sizeof(REAL));    
  }

  return e;
}


int AEC::dtd(REAL d, REAL x)
{
  // optimized implementation of max(|x[0]|, |x[1]|, .., |x[L-1]|):
  // calculate max of block (DTD_LEN values)
  x = fabsf(x);
  if (x > max_x[dtdNdx]) {
    max_x[dtdNdx] = x;
    if (x > max_max_x) {
      max_max_x = x;
    }
  }
  if (++dtdCnt >= DTD_LEN) {
    dtdCnt = 0;
    // calculate max of max
    max_max_x = 0.0f;
    for (int i = 0; i < NLMS_LEN/DTD_LEN; ++i) {
      if (max_x[i] > max_max_x) {
        max_max_x = max_x[i];
      }
    }
    // rotate Ndx
    if (++dtdNdx >= NLMS_LEN/DTD_LEN) dtdNdx = 0;
    max_x[dtdNdx] = 0.0f;
  }

  // The Geigel DTD algorithm with Hangover timer Thold
  if (fabsf(d) >= GeigelThreshold * max_max_x) {
    hangover = Thold;
  }
    
  if (hangover) --hangover;
  
  return (hangover > 0);
}


int AEC::doAEC(int d, int x) 
{
  REAL s0 = (REAL)d;
  REAL s1 = (REAL)x;
  
  // Mic Highpass Filter - to remove DC
  s0 = hp00.highpass(s0);
  
  // Mic Highpass Filter - telephone users are used to 300Hz cut-off
  s0 = hp0.highpass(s0);

  // ambient mic level estimation
  s0avg += 1e-4f*(fabsf(s0) - s0avg);
  
  // Spk Highpass Filter - to remove DC
  s1 = hp1.highpass(s1);

  // Double Talk Detector
  int update = !dtd(s0, s1);

  // Acoustic Echo Cancellation
  s0 = nlms_pw(s0, s1, update);

  // Acoustic Echo Suppression
  if (update) {
    // Non Linear Processor (NLP): attenuate low volumes
    s0 *= NLPAttenuation;
  }
  
  // Saturation
  if (s0 > MAXPCM) {
    return (int)MAXPCM;
  } else if (s0 < -MAXPCM) {
    return (int)-MAXPCM;
  } else {
    return (int)roundf(s0);
  }
}
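
  AEC::dtd() keeps block maxima to avoid scanning the whole loudspeaker
  history for every sample. The following is a minimal, unoptimized
  sketch of the same Geigel rule with hangover, for comparison only.
  The class GeigelSketch and its parameters are hypothetical and not
  part of aec.cpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical, simplified Geigel detector (brute-force maximum search).
class GeigelSketch {
  std::vector<float> hist;   // last L loudspeaker magnitudes
  std::size_t pos;           // next write position in hist
  float threshold;           // e.g. 0.5 for -6 dB
  int hold;                  // hangover length in samples
  int hangover;              // remaining hangover samples

public:
  GeigelSketch(std::size_t L, float threshold_, int hold_)
    : hist(L, 0.0f), pos(0), threshold(threshold_), hold(hold_),
      hangover(0)
  {
    assert(L > 0);
  }

  // d: microphone sample, x: loudspeaker sample.
  // Returns true while near-end speech (double talk) is assumed.
  bool talking(float d, float x)
  {
    hist[pos] = std::fabs(x);
    pos = (pos + 1) % hist.size();

    // max(|x[0]|, .., |x[L-1]|) by scanning the whole history
    float max_x = 0.0f;
    for (std::size_t i = 0; i < hist.size(); ++i) {
      if (hist[i] > max_x) max_x = hist[i];
    }

    // Geigel rule: the microphone is louder than the expected echo
    if (std::fabs(d) >= threshold * max_x) {
      hangover = hold;
    }
    if (hangover > 0) --hangover;
    return hangover > 0;
  }
};
```

  As in AEC::dtd(), the hangover counter is decremented in the same call
  that sets it, so a hold of 3 keeps the detector active for the
  triggering sample and one further sample.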


/****************** APPENDIX cirbuf.h ******************/

/* cirbuf.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Circular Buffers
 *
 * Version 1.0
 */
#ifndef _CIRBUF_H

// must be multiple of FRAGSIZE and FRAMESIZE
#define CIRBUFSIZE	(2*80*8*2)      

/* circular buffer for FRAGSIZE to FRAMESIZE conversion with 
 * overrun/underrun */
class CIRBUF {
  char buf[CIRBUFSIZE];   // must be multiple of FRAGSIZE and FRAMESIZE
  int in;
  int out;
  int len;

public:
   CIRBUF();
  void init();
  int push(char *from, int size);
  int pop(char *to, int size);
  int getlen() {
    return len;
  }
};

#define _CIRBUF_H
#endif


/****************** APPENDIX cirbuf.cpp ******************/

/* cirbuf.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Circular buffers
 *
 * Version 1.0
 */
 
#include <string.h>
#include <strings.h>

#include "oss.h"
#include "cirbuf.h"
#include "intercomd.h"

CIRBUF::CIRBUF()
{
  bzero(buf, CIRBUFSIZE);
  in = out = len = 0;
}

void CIRBUF::init()
{
  bzero(buf, CIRBUFSIZE);
  in = out = len = 0;
}

int CIRBUF::push(char *from, int size)
{
  memcpy(buf + in, from, size);
  in += size;
  if (in >= CIRBUFSIZE) {
    in -= CIRBUFSIZE;
  }
  len += size;
  if (len > CIRBUFSIZE) {
    int oversize = (((len - CIRBUFSIZE) / FRAGSIZE)) * FRAGSIZE;
    if (oversize < len - CIRBUFSIZE) {
      oversize += FRAGSIZE;
    }
    // delete oldest if overrun
    out += oversize;
    if (out >= CIRBUFSIZE) {
      out -= CIRBUFSIZE;
    }
    len -= oversize;
    return -oversize;
  } else {
    return OKAY;
  }
}

int CIRBUF::pop(char *to, int size)
{
  if (len < size) {
    // play out silence if underrun
    bzero(to, size);
    return ERROR;
  }
  memcpy(to, buf + out, size);
  out += size;
  if (out >= CIRBUFSIZE) {
    out -= CIRBUFSIZE;
  }
  len -= size;
  return OKAY;
}
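
  CIRBUF copies each block with a single memcpy(), which is only safe
  because CIRBUFSIZE is a multiple of every block size in use. The
  following is a minimal sketch of a ring buffer that splits the copy
  at the end of the storage instead. The class ByteRing is hypothetical
  and not part of cirbuf.cpp. Unlike CIRBUF::push() it drops exactly
  the surplus bytes on overrun, not whole fragments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical wrap-safe byte ring.
class ByteRing {
  std::vector<char> buf;
  std::size_t in, out, len;

public:
  explicit ByteRing(std::size_t capacity)
    : buf(capacity, 0), in(0), out(0), len(0)
  {
    assert(capacity > 0);
  }

  std::size_t size() const { return len; }

  // Append size bytes. Returns the number of oldest bytes dropped.
  std::size_t push(const char *from, std::size_t size)
  {
    assert(size <= buf.size());
    std::size_t first = buf.size() - in;  // room up to the end of buf
    if (first > size) first = size;
    std::memcpy(&buf[in], from, first);
    std::memcpy(&buf[0], from + first, size - first);
    in = (in + size) % buf.size();
    len += size;

    std::size_t dropped = 0;
    if (len > buf.size()) {               // overrun: delete oldest
      dropped = len - buf.size();
      out = (out + dropped) % buf.size();
      len = buf.size();
    }
    return dropped;
  }

  // Remove size bytes. On underrun fill `to` with silence, return false.
  bool pop(char *to, std::size_t size)
  {
    if (len < size) {
      std::memset(to, 0, size);
      return false;
    }
    std::size_t first = buf.size() - out;
    if (first > size) first = size;
    std::memcpy(to, &buf[out], first);
    std::memcpy(to + first, &buf[0], size - first);
    out = (out + size) % buf.size();
    len -= size;
    return true;
  }
};
```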


/****************** APPENDIX oss.h ******************/

/* oss.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Open Sound System
 *
 * Version 1.0
 */
#ifndef _OSS_H

/* Design Constants */
#define AUDIOBUFS       2       // Soundcard buffers (minimum 2)
#define FRAMES	        4       // 1 to 4 (20ms to 80ms)
/* End of Design Constants */

#define FORMAT_OSS	AFMT_S16_LE

/* Using fragment sizes shorter than 256 bytes is not recommended as the
 * default mode of application
 */
#if FRAMES==4
#define FRAGSIZELD	6 // 8
#elif FRAMES==2
#define FRAGSIZELD	7
#else
#define FRAGSIZELD	6
#endif

#define FRAGSIZE		(1<<FRAGSIZELD)
#define FRAGTIME		(FRAGSIZE/16)

int audio_init(char *pathname, int channels_);

#define _OSS_H
#endif


/****************** APPENDIX oss.cpp ******************/

/* oss.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Open Sound System functions
 *
 * Version 1.0
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* Error handling */
#include <assert.h>
#include <errno.h>

/* low level io */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>

/* OSS, see www.4front-tech.com/pguide/index.html */
#include <sys/soundcard.h>

#include "oss.h"
#include "intercomd.h"

int audio_init(char *pathname, int channels_)
{
/* Using full duplex is simple in theory. The application just: 
 * Opens the device.
 * Turns on full duplex
 * Sets fragment size if necessary
 * Sets number of channels, sample format and sampling rate
 * Starts reading and writing the device
 */
  fprintf(stderr, "OSS Header SOUND_VERSION = %x\n", SOUND_VERSION);

  int audio_fd = open(pathname, O_RDWR);
  assert_errno(audio_fd >= 0);

  int sound_version = 0;
  ioctl(audio_fd, OSS_GETVERSION, &sound_version);
  fprintf(stderr, "OSS Driver SOUND_VERSION = %x\n", sound_version);

  ioctl(audio_fd, SNDCTL_DSP_SETDUPLEX, 0);

  /* The 16 most significant bits (MMMM) determine maximum number of
   * fragments. By default the driver computes this based on available
   * buffer space. 
   * The minimum value is 2 and the maximum depends on the situation. 
   * Set MMMM=0x7fff if you don't want to limit the number of fragments   
   */
  // fragsize=2^FRAGSIZELD bytes 
  int frag = AUDIOBUFS << 16 | FRAGSIZELD;  
  if (2 == channels_) {
    ++frag;                     // double FRAGSIZE in stereo mode
  }
  int frag_ = frag;
  ioctl(audio_fd, SNDCTL_DSP_SETFRAGMENT, &frag);
  fprintf(stderr, "SETFRAGMENT=0x%x\n", frag);
  assert_errno(frag_ == frag);

  int format = FORMAT_OSS;
  ioctl(audio_fd, SNDCTL_DSP_SETFMT, &format);
  assert_errno(format == FORMAT_OSS);

  int channels = channels_;
  ioctl(audio_fd, SNDCTL_DSP_CHANNELS, &channels);
  assert_errno(channels_ == channels);
  fprintf(stderr, "SNDCTL_DSP_CHANNELS=%d\n", channels);

  int rate = 8000;
  ioctl(audio_fd, SNDCTL_DSP_SPEED, &rate);
  assert_errno(8000 == rate);
  fprintf(stderr, "SNDCTL_DSP_SPEED=%d\n", rate);

  fprintf(stderr, "\n");
  return audio_fd;
}


/****************** APPENDIX rtp.h ******************/

/* rtp.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * Real-time Transport Protocol Version 2 (RFC3550)
 *
 * Version 1.0
 */

class RTP {
  /* Format in Host Byte order. Conversion with htonl() to Network Byte
   * order. This data structure is implementation dependent! 
   * Tested with GCC and x86; it assumes a 32-bit unsigned long. */

  unsigned long sequence:16;
  unsigned long payload_type:7;
  unsigned long marker:1;
  unsigned long csrc_count:4;
  unsigned long extension:1;
  unsigned long padding:1;
  unsigned long version:2;

  unsigned long timestamp;

  unsigned long ssrc;

public:
  RTP();
  void init(int payload_type_);
  void next(int frameduration);
  int check(int payload_type);
  unsigned long getssrc() {
    return ssrc;
  };
};

const unsigned PT_PCMU = 0;     // 8000 sample/second, G.711 u-Law
const unsigned PT_iLBC = 96;    // unofficial (dynamic) RTP payload type

char *RTP_network_copy(char *to, RTP * from);
char *RTP_host_copy(RTP * to, char *from);
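
  The bit-field layout of class RTP depends on compiler and CPU. The
  following is a minimal sketch of a byte-wise conversion of the
  12 byte fixed RTP header (RFC 3550, section 5.1) that does not rely
  on bit-fields. The names RtpHeaderSketch, rtp_pack and rtp_unpack are
  hypothetical and not part of rtp.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain struct. Field names follow RFC 3550.
struct RtpHeaderSketch {
  unsigned version;       // 2 bits, value 2 for RFC 3550
  bool padding;
  bool extension;
  unsigned csrc_count;    // 4 bits
  bool marker;
  unsigned payload_type;  // 7 bits
  std::uint16_t sequence;
  std::uint32_t timestamp;
  std::uint32_t ssrc;
};

const std::size_t RTP_FIXED_HEADER_SIZE = 12;

// Write the fixed header in network byte order. Returns the end pointer.
inline unsigned char *rtp_pack(unsigned char *to, const RtpHeaderSketch &h)
{
  assert(h.version < 4 && h.csrc_count < 16 && h.payload_type < 128);
  to[0] = (unsigned char)((h.version << 6) | (h.padding ? 0x20u : 0u) |
                          (h.extension ? 0x10u : 0u) | h.csrc_count);
  to[1] = (unsigned char)((h.marker ? 0x80u : 0u) | h.payload_type);
  to[2] = (unsigned char)(h.sequence >> 8);
  to[3] = (unsigned char)(h.sequence & 0xFF);
  for (int i = 0; i < 4; ++i) {
    to[4 + i] = (unsigned char)(h.timestamp >> (24 - 8 * i));
    to[8 + i] = (unsigned char)(h.ssrc >> (24 - 8 * i));
  }
  return to + RTP_FIXED_HEADER_SIZE;
}

// Read the fixed header from network byte order. Returns the end pointer.
inline const unsigned char *rtp_unpack(RtpHeaderSketch *h,
                                       const unsigned char *from)
{
  h->version = from[0] >> 6;
  h->padding = (from[0] & 0x20) != 0;
  h->extension = (from[0] & 0x10) != 0;
  h->csrc_count = from[0] & 0x0F;
  h->marker = (from[1] & 0x80) != 0;
  h->payload_type = from[1] & 0x7F;
  h->sequence = (std::uint16_t)((from[2] << 8) | from[3]);
  h->timestamp = 0;
  h->ssrc = 0;
  for (int i = 0; i < 4; ++i) {
    h->timestamp = (h->timestamp << 8) | from[4 + i];
    h->ssrc = (h->ssrc << 8) | from[8 + i];
  }
  return from + RTP_FIXED_HEADER_SIZE;
}
```

  Like RTP_network_copy() and RTP_host_copy(), both functions return a
  pointer just past the header, so that the payload can be appended or
  read next.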


/****************** APPENDIX tcp.h ******************/

/* tcp.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * TCP server functions for IPv4
 *
 * Version 1.0
 */

int tcp_server_init(int port);
int tcp_server_init2(int listen_fd);


/****************** APPENDIX tcp.cpp ******************/

/* tcp.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * TCP server functions for IPv4
 *
 * Version 1.0
 */


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

 /* Socket io */
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <assert.h>
#include <sys/types.h>
#include <netdb.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/errno.h>
#include <fcntl.h>

/* error handling */
#include <assert.h>

#include <signal.h>

#include "intercomd.h"

int tcp_server_init(int port)
/* open the server (listen) port - do this one time*/
{
  int fd = socket(PF_INET, SOCK_STREAM, 0);
  assert_errno(fd >= 0);

  struct sockaddr_in sock;
  memset((char *) &sock, 0, sizeof(sock));
  sock.sin_family = AF_INET;
  sock.sin_addr.s_addr = htonl(INADDR_ANY);
  sock.sin_port = htons(port);
  if (bind(fd, (struct sockaddr *) &sock, sizeof(sock)) < 0) {
    fprintf(stderr, "tcp_server_init(): bind() failed\n");
    exit(2);
  }

  int ret = listen(fd, 1);
  assert_errno(ret >= 0);

  return fd;
}


int tcp_server_init2(int listen_fd)
/* open the communication (connection) - do this for every client */
{
  // fprintf(stderr, "tcp_server_init2\n");

  // avoid blocking accept() 
  int flags = fcntl(listen_fd, F_GETFL);
  int ret = fcntl(listen_fd, F_SETFL, flags | O_NONBLOCK);
  assert_errno(ret >= 0);

  struct sockaddr_in sock;
  socklen_t socklen = sizeof(sock);
  int fd = accept(listen_fd, (struct sockaddr *) &sock, &socklen);
  assert_errno(fd >= 0);

  return fd;
}


/****************** APPENDIX udp.h ******************/

/* udp.h
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * UDP functions for IPv4
 *
 * Version 1.0
 */

class UDP {
  int local_fd;
  struct sockaddr_in foreign_sock;
public:
  void send_init(char *foreign_name, int foreign_port, int local_fd_);
  void send(char *buf, int bytes);
  void send_close();
};

int UDP_recv_init(int port);

// IP to ASCII
char *iptoa(char *buf, in_addr_t ip);

// ASCII to IP
in_addr_t atoip(char *buf);


/****************** APPENDIX udp.cpp ******************/

/* udp.cpp
 *
 * Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
 *
 * UDP functions for IPv4
 * Multicast Doc: /usr/share/doc/howto/en/html/Multicast-HOWTO.html
 *
 * Version 1.0
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

 /* Socket io */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <sys/time.h>
#include <unistd.h>
#include <sys/errno.h>
#include <assert.h>

#include "udp.h"
#include "intercomd.h"

int UDP_recv_init(int port)
/* start UDP LAN reception. Exits the program on error! */
/* return: file descriptor for reception */
/* in port: UDP port number */
{
  int fd;
  struct sockaddr_in sock;
  int local_flag;
  socklen_t local_flagsize;
  int reuse_flag, reuse_len;

  /* init UDP local1 */
  fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    fprintf(stderr, "UDP_recv_init(): socket() failed\n");
    exit(1);
  }

  memset((char *) &sock, 0, sizeof(sock));
  sock.sin_family = AF_INET;
  sock.sin_addr.s_addr = htonl(INADDR_ANY);
  sock.sin_port = htons(port);
  if (bind(fd, (struct sockaddr *) &sock, sizeof(sock)) < 0) {
    fprintf(stderr, "UDP_recv_init(): bind() failed\n");
    exit(2);
  }
  local_flagsize = sizeof(int);
  if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, (char *) &local_flag,
      &local_flagsize) < 0) {
    fprintf(stderr, "UDP_recv_init(): getsockopt() failed\n");
    exit(3);
  }
  /* printf("SO_RCVBUF=%d\n", local_flag); */

  /* printf("init socket\n"); */
  return fd;
}

void UDP::send_init(char *foreign_name, int foreign_port, int local_fd_)
/* start sending UDP messages on the LAN */
/* in foreign_name: IP address to send to */
/* in foreign_port: UDP port to send to */
{
  struct sockaddr_in local_sock;
  struct hostent *foreign_host;
  int local_flag;
  socklen_t local_flagsize;

  local_fd = local_fd_;

  /* turn on broadcast */
  local_flagsize = sizeof(int);
  if (getsockopt(local_fd, SOL_SOCKET, SO_BROADCAST,
      (char *) &local_flag, &local_flagsize) < 0) {
    fprintf(stderr, "udp_send_init(): getsockopt() failed\n");
    exit(3);
  }
  if (local_flag == 0) {
    local_flag = 1;
    setsockopt(local_fd, SOL_SOCKET, SO_BROADCAST, (char *) &local_flag,
      sizeof(int));
    local_flagsize = sizeof(int);
    if (getsockopt(local_fd, SOL_SOCKET, SO_BROADCAST,
        (char *) &local_flag, &local_flagsize) < 0) {
      fprintf(stderr, "udp_send_init() SO_BROADCAST failed\n");
      exit(3);
    }
  }

  /* init foreign part */
  memset((char *) &foreign_sock, 0, sizeof(foreign_sock));

  foreign_host = gethostbyname(foreign_name);
  if (foreign_host == NULL || foreign_host->h_length == 0) {
    fprintf(stderr, "udp_send_init(): gethostbyname() failed\n");
    exit(1);
  }
  memcpy(&foreign_sock.sin_addr.s_addr, foreign_host->h_addr_list[0],
    foreign_host->h_length);
  foreign_sock.sin_family = AF_INET;
  foreign_sock.sin_port = htons(foreign_port);


  in_addr_t to_ip = ntohl(foreign_sock.sin_addr.s_addr);
  char s[20];
  print_gui("c %s\n", iptoa(s, to_ip));
}

void UDP::send(char *buf, int bytes)
/* send a UDP message on the LAN */
/* in buf: message */
/* in bytes: length of the message */
{
  int len;

  len = sendto(local_fd,
    buf, bytes,
    0, (struct sockaddr *) &foreign_sock, sizeof(foreign_sock));
  if (len != bytes) {
    fprintf(stderr, "udp_send(): sendto() foreign failed ret=%d\n",
      len);
  }
}

void UDP::send_close()
{
  memset((char *) &foreign_sock, 0, sizeof(foreign_sock));
}

// IP to ASCII
char *iptoa(char *buf, in_addr_t ip)
{
  int i1 = ip >> 24;
  int i2 = (ip >> 16) & 0xFF;
  int i3 = (ip >> 8) & 0xFF;
  int i4 = ip & 0xFF;

  sprintf(buf, "%d.%d.%d.%d", i1, i2, i3, i4);

  return buf;
}

// ASCII to IP
in_addr_t atoip(char *buf)
{
  unsigned int i1 = 0, i2 = 0, i3 = 0, i4 = 0;

  /* a malformed address leaves the unparsed fields at 0 */
  sscanf(buf, "%u.%u.%u.%u", &i1, &i2, &i3, &i4);

  return i1 << 24 | i2 << 16 | i3 << 8 | i4;
}
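
  iptoa() and atoip() convert dotted-quad addresses by hand and do not
  report malformed input. The following is a minimal sketch of the same
  conversion with the POSIX functions inet_pton() and inet_ntop(). The
  names parse_ipv4 and format_ipv4 are hypothetical and not part of
  udp.cpp. As with atoip(), the address is kept in host byte order.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cassert>
#include <string>

// Hypothetical helper: parse "a.b.c.d" into a host byte order address.
// Returns false on malformed input and leaves *ip_host_order untouched.
inline bool parse_ipv4(const char *text, in_addr_t *ip_host_order)
{
  assert(text != NULL && ip_host_order != NULL);
  struct in_addr addr;
  if (inet_pton(AF_INET, text, &addr) != 1) {
    return false;
  }
  *ip_host_order = ntohl(addr.s_addr);
  return true;
}

// Hypothetical helper: format a host byte order address as "a.b.c.d".
inline std::string format_ipv4(in_addr_t ip_host_order)
{
  struct in_addr addr;
  addr.s_addr = htonl(ip_host_order);
  char buf[INET_ADDRSTRLEN];
  const char *s = inet_ntop(AF_INET, &addr, buf, sizeof(buf));
  return s ? std::string(s) : std::string();
}
```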


/****************** APPENDIX intercom.tcl ******************/

#!/usr/bin/wish

# intercom.tcl
#
# Copyright (C) DFS Deutsche Flugsicherung (2004). All Rights Reserved.
#
# Voice-over-IP Intercom Graphical User Interface
#
# Version 1.1 with short/long keypress


proc t_color {t state} {
  switch $state {
  0 { .$t configure -foreground black         ;# nothing
      .$t configure -activeforeground black
      .$t configure -background "#d9d9d9"
      .$t configure -activebackground "#d9d9d9"}
  1 { .$t configure -foreground black         ;# transmit
      .$t configure -activeforeground black
      .$t configure -background yellow
      .$t configure -activebackground yellow}
  2 { .$t configure -foreground black         ;# receive
      .$t configure -activeforeground black
      .$t configure -background magenta
      .$t configure -activebackground magenta}
  3 { .$t configure -foreground black         ;# full duplex
      .$t configure -activeforeground black
      .$t configure -background green
      .$t configure -activebackground green}
  }
}

proc keyPress {t} {
  global tmap state sock mode tping
  
  set mode($t) 0
  set ip $tmap($t)
  switch $state($t) {
  0 {set cmd c
    after 300 [list set mode($t) 1]}
  1 {set cmd h}
  2 {set cmd c
    after 300 [list set mode($t) 1]}
  3 {set cmd h}
  }
  
  # puts "$cmd $ip"
  puts $sock "$cmd $ip"
  flush $sock
}

proc keyRelease {t} {
  global mode
  
  if {$mode($t)} {
    keyPress $t
  }
}

proc tx_begin {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "tx_begin $ip $t"
  switch $state($t) {
  0 {set state($t) 1}
  1 { }
  2 {set state($t) 3}
  3 { }
  }
  t_color $t $state($t)
}

proc rx_begin {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "rx_begin $ip $t"
  switch $state($t) {
  0 {set state($t) 2}
  1 {set state($t) 3}
  2 { }
  3 { }
  }
  t_color $t $state($t)
}

proc tx_end {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "tx_end $ip $t"
  switch $state($t) {
  0 {}
  1 {set state($t) 0}
  2 { }
  3 {set state($t) 2}
  }
  t_color $t $state($t)
}

proc rx_end {ip} {
  global ipmap state

  set t $ipmap($ip)
  puts "rx_end $ip $t"
  switch $state($t) {
  0 { }
  1 { }
  2 {set state($t) 0}
  3 {set state($t) 1}
  }
  t_color $t $state($t)
}

proc recv {} {
  global sock
  
  gets $sock cmd
  # puts $cmd
  set argv [split $cmd]
  # puts $argv
  set ip [lindex $argv 1]
  switch [lindex $argv 0] {
  c {tx_begin $ip}
  r {rx_begin $ip}
  h {tx_end $ip}
  d {rx_end $ip}
  }
}

# include GUI
source intercom.ui.tcl
intercom_ui .

proc guiconfig {t text ip} {
  global state mode ipmap tmap
  
  set state($t) 0
  set mode($t) 0
  
  set ipmap($ip) $t
  set tmap($t) $ip
  .$t configure -text $text -highlightthickness 12
  .$t configure -command [list keyRelease $t]
  bind .$t <ButtonPress-1> [list keyPress $t]
}

# include configuration
source intercom.config.tcl

# init TCP connection to intercomd
set sock [socket 127.0.0.1 4999]
fileevent $sock readable recv

set nodename [exec uname -n]
wm title . "intercom $nodename"


/****************** APPENDIX intercom.ui.tcl ******************/

#! /bin/sh
# the next line restarts using wish \
exec wish "$0" "$@"

# interface generated by SpecTcl version 1.1 from /home/anblf/voipconf/intercom.ui
#   root     is the parent window for this user interface

proc intercom_ui {root args} {

	# this treats "." as a special case

	if {$root == "."} {
	    set base ""
	} else {
	    set base $root
	}
    
	canvas $base.canvas#1 \
		-height 0 \
		-width 400

	canvas $base.canvas#2 \
		-height 160 \
		-width 0

	button $base.t1 \
		-text t1

	button $base.t2 \
		-text t2

	button $base.t3 \
		-text t3

	button $base.t4 \
		-text t4

	button $base.t5 \
		-text t5

	button $base.t6 \
		-text t6

	button $base.t7 \
		-text t7

	button $base.t8 \
		-text t8


	# Geometry management

	grid $base.canvas#1 -in $root	-row 1 -column 2  \
		-columnspan 4 \
		-sticky nesw
	grid $base.canvas#2 -in $root	-row 2 -column 1  \
		-rowspan 2 \
		-sticky nesw
	grid $base.t1 -in $root	-row 2 -column 2 
	grid $base.t2 -in $root	-row 2 -column 3 
	grid $base.t3 -in $root	-row 2 -column 4 
	grid $base.t4 -in $root	-row 2 -column 5 
	grid $base.t5 -in $root	-row 3 -column 2 
	grid $base.t6 -in $root	-row 3 -column 3 
	grid $base.t7 -in $root	-row 3 -column 4 
	grid $base.t8 -in $root	-row 3 -column 5 

	# Resize behavior management

	grid rowconfigure $root 1 -weight 0 -minsize 2
	grid rowconfigure $root 2 -weight 0 -minsize 30
	grid rowconfigure $root 3 -weight 0 -minsize 30
	grid columnconfigure $root 1 -weight 0 -minsize 2
	grid columnconfigure $root 2 -weight 0 -minsize 30
	grid columnconfigure $root 3 -weight 0 -minsize 30
	grid columnconfigure $root 4 -weight 0 -minsize 30
	grid columnconfigure $root 5 -weight 0 -minsize 30
# additional interface code
# end additional interface code

}


# Allow interface to be run "stand-alone" for testing

catch {
    if [info exists embed_args] {
	# we are running in the plugin
	intercom_ui .
    } else {
	# we are running in stand-alone mode
	if {$argv0 == [info script]} {
	    wm title . "Testing intercom_ui"
	    intercom_ui .
	}
    }
}