Participants in a conference may not all have connections of the same bandwidth. So how do they take part simultaneously?
One solution is for everyone to use the lowest bandwidth available to any participant. But this forces a reduced-quality audio encoding on all of them, including those with fast connections.
A smarter solution is to use an RTP-level relay called a mixer. A mixer may be placed near the low-bandwidth area. The mixer resynchronizes incoming audio packets to reconstruct the constant 20 ms spacing generated by the sender, mixes the reconstructed audio streams into a single stream, translates the audio encoding to a lower-bandwidth one, and forwards the lower-bandwidth packet stream across the low-speed link. The following figure gives a graphical representation:
The mixer puts its own identifier in the packet as the synchronization source (SSRC) and lists the original senders as contributing sources in the CSRC fields. If you don't know about SSRC and CSRC, come back to this paragraph after going through the RTP header structure.
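To make the mixing and the SSRC/CSRC bookkeeping concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a real mixer: the function names `mix_frames` and `build_mixer_packet`, the SSRC values, the dynamic payload type 96, and the sample data are all hypothetical, the audio is assumed to be signed 16-bit PCM, and resynchronization and transcoding are left out. The header layout follows the fixed RTP header: a 12-byte header followed by up to 15 CSRC entries of 32 bits each.

```python
import struct

RTP_VERSION = 2
MAX_CSRC = 15  # the CSRC count (CC) field is only 4 bits wide


def mix_frames(frames):
    """Mix equal-length frames of signed 16-bit PCM samples by summing and clipping."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same number of samples")
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        # Clip to the 16-bit range so that loud passages do not wrap around.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def build_mixer_packet(mixer_ssrc, contributing_ssrcs, seq, timestamp,
                       payload_type, payload):
    """Build an RTP packet whose SSRC is the mixer and whose CSRC list names the senders."""
    csrcs = list(contributing_ssrcs)
    if len(csrcs) > MAX_CSRC:
        raise ValueError("an RTP packet can carry at most 15 CSRC entries")
    byte0 = (RTP_VERSION << 6) | len(csrcs)  # padding = 0, extension = 0, CC = count
    byte1 = payload_type & 0x7F              # marker bit = 0
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, mixer_ssrc)
    csrc_list = struct.pack("!%dI" % len(csrcs), *csrcs)
    return header + csrc_list + payload


# Hypothetical example: two senders, three samples each.
alice = [1000, -2000, 30000]
bob = [500, -500, 10000]
mixed = mix_frames([alice, bob])
payload = struct.pack("!%dh" % len(mixed), *mixed)
packet = build_mixer_packet(
    mixer_ssrc=0x11111111,
    contributing_ssrcs=[0xAAAAAAAA, 0xBBBBBBBB],
    seq=1,
    timestamp=160,
    payload_type=96,
    payload=payload,
)
print(mixed)        # [1500, -2500, 32767]
print(len(packet))  # 26 = 12-byte header + 2 * 4-byte CSRC + 6-byte payload
```

A receiver behind the mixer sees a single stream whose SSRC is the mixer's, and can still tell who is speaking by reading the CSRC list.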
Mixers have other uses too. An example is a video mixer that scales the images of individual people in separate video streams and composites them into one video stream to simulate a group scene.