Sounds Contents
Sounds
DefineSound
Sound Styles
Sound Envelope
StartSound
ADPCM Compression
ADPCM Sound Data
ADPCM Packets
ADPCM Code Data
MP3 Compression
MP3 Sound Data
MP3 Frame
Streaming Sound
SoundStreamHead
SoundStreamHead2
SoundStreamBlock
SDK Examples
The SWF file format defines a small and efficient sound model. SWF supports sample rates of 5.5, 11, 22 and 44 kHz in both stereo and mono. It is assumed that the playback platform can support rate conversion and multi-channel mixing of these sounds. The number of simultaneous channels supported depends on the CPU of specific platforms, but is typically three to eight channels.
There are two types of sounds in SWF:
1. Event sounds.
2. Streaming sounds.
Event sounds are played in response to some event such as a mouse-click, or when the player reaches a certain frame. Event sounds must be defined (downloaded) before they are used. They can be reused and mixed at runtime. Event sounds may also have a sound ‘style’ that modifies how the basic sound is played. Sound styles include looping, fades and level control. For more sophisticated control, there is a sound ‘envelope’, a simple transform that can generate various types of fades and level control, such as fade-in, fade-out and cross-fade.
Streaming sounds are downloaded and played in tight synchronization with the timeline. In this mode, sound packets are stored with each frame. Normally, if a CPU cannot render the frames of a movie as quickly as the movie's frame rate specifies, the player slows down the animation rate rather than skipping frames. When streaming sound is present, however, the player skips frames in order to maintain tight synchronization with the sound track.
Note: A timeline can only have a single streaming sound playing at a time, but each movie clip, or sprite, can have its own streaming sound.
There are three tags required to play an event sound:
1. The DefineSound tag defines the audio samples that make up an “event” sound. It also includes the sampling rate, sample size, and a stereo/mono flag.
2. The SOUNDINFO record defines the ‘styles’ that are applied to the event sound. Styles include fade-in, fade-out, synchronization and looping flags, and envelope control.
3. The StartSound tag instructs the player to begin playing the sound.
The DefineSound tag defines an event sound. It includes the sampling rate, size of each sample (8 or 16 bit), a stereo/mono flag, and an array of audio samples. The audio data may be stored in three ways:
1. As uncompressed raw samples.
2. Compressed using a modified ADPCM algorithm.
3. Compressed using MP3 compression.
(See class FDTDefineSound in the Flash File Format SDK)
DefineSound

| Field | Type | Comment |
|---|---|---|
| Header | RECORDHEADER | Tag ID = 14 |
| SoundId | UI16 | ID for this character |
| SoundFormat | UB[4] 0 = uncompressed, 1 = ADPCM, 2 = MP3 (sndCompressMP3) | Format of SoundData |
| SoundRate | UB[2] 0 = 5.5 kHz, 1 = 11 kHz, 2 = 22 kHz, 3 = 44 kHz | The sampling rate |
| SoundSize | UB[1] 0 = 8-bit, 1 = 16-bit | Size of each sample |
| SoundType | UB[1] 0 = mono, 1 = stereo | Mono or stereo sound |
| SoundSampleCount | UI32 | Number of samples |
| SoundData | UI8[size of sound data] | The sound data; may be uncompressed, ADPCM or MP3 |
Notes:
· The SoundId field uniquely identifies the sound so it can be played by StartSound.
The contents of SoundData vary depending on the value of the SoundFormat field in the DefineSound tag:
· If SoundFormat is 0, SoundData contains raw, uncompressed samples.
· If SoundFormat is 1, SoundData contains an ADPCMSOUNDDATA record.
· If SoundFormat is 2, SoundData contains an MP3SOUNDDATA record.
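The four packed fields that follow SoundId fit in a single byte (SWF bit fields are read from the most significant bit), so a parser can decode them with shifts and masks. A minimal sketch; the struct and function names are illustrative, not from the SDK, and the exact rate values 5512/11025/22050/44100 Hz correspond to the 5.5, 11, 22 and 44 kHz rates mentioned above:

```cpp
#include <cstdint>

// Decoded form of the byte that follows SoundId in a DefineSound tag.
// Bit layout, most-significant bits first, per the table above:
//   SoundFormat : 4 bits, SoundRate : 2 bits, SoundSize : 1 bit, SoundType : 1 bit
struct SoundFlags {
    uint8_t  format;        // 0 = uncompressed, 1 = ADPCM, 2 = MP3
    uint32_t rateHz;        // decoded sampling rate
    uint8_t  bitsPerSample; // 8 or 16
    uint8_t  channels;      // 1 = mono, 2 = stereo
};

SoundFlags decodeSoundFlags(uint8_t b) {
    static const uint32_t kRates[4] = { 5512, 11025, 22050, 44100 };
    SoundFlags f;
    f.format        = (b >> 4) & 0x0F;
    f.rateHz        = kRates[(b >> 2) & 0x03];
    f.bitsPerSample = ((b >> 1) & 0x01) ? 16 : 8;
    f.channels      = (b & 0x01) ? 2 : 1;
    return f;
}
```

For example, the byte 0x2F decodes to MP3, 44 kHz, 16-bit, stereo.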
The SOUNDINFO record modifies how an event sound is played. An event sound is defined with the DefineSound tag. Sound characteristics that can be modified include:
· Whether the sound loops (repeats) and how many times it loops.
· Simple fade in and fade out controls.
· A sound ‘envelope’ for more sophisticated control over levels.
(See class FSoundInfo in the Flash File Format SDK)
SOUNDINFO

| Field | Type | Comment |
|---|---|---|
| SyncFlags | UB[4] | Sync flags; for example, don't start the sound if it is already playing |
| HasEnvelope | UB[1] | Has envelope information if equal to 1 |
| HasLoops | UB[1] | Has loop information if equal to 1 |
| HasOutPoint | UB[1] | Has out-point information if equal to 1 |
| HasInPoint | UB[1] | Has in-point information if equal to 1 |
| InPoint | If HasInPoint, UI32 | Sound in-point value |
| OutPoint | If HasOutPoint, UI32 | Sound out-point value |
| LoopCount | If HasLoops, UI16 | Sound loop count |
| EnvelopeNumPoint | If HasEnvelope, UI8 | Sound envelope point count (nPoints) |
| EnvelopeRecords | If HasEnvelope, SOUNDENVELOPE[nPoints] | Sound envelope records |
(See class FSndEnv in the Flash File Format SDK)
SOUNDENVELOPE

| Field | Type | Comment |
|---|---|---|
| Mark44 | UI32 | Position of the envelope point, in 44 kHz samples |
| Level0 | UI16 | Volume level for the first (left) channel |
| Level1 | UI16 | Volume level for the second (right) channel |
StartSound is a control tag that starts (or stops) playing a sound defined by DefineSound. The SoundId field identifies which sound is to be played; the SOUNDINFO field defines how the sound is played. You can stop a playing sound by setting the syncStop flag in the SOUNDINFO record.
(See class FCTStartSound in the Flash File Format SDK)
StartSound

| Field | Type | Comment |
|---|---|---|
| Header | RECORDHEADER | Tag ID = 15 |
| SoundId | UI16 | ID of sound character to play |
| SoundInfo | SOUNDINFO | Sound style information |
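Putting the pieces together, a writer emits a short-form RECORDHEADER (the tag code shifted left six bits, OR'd with the body length, stored little-endian) followed by the tag body. This sketch serializes a minimal StartSound whose SOUNDINFO has all Has* flags clear, so the record collapses to a single zero byte; the function name is illustrative, not from the SDK:

```cpp
#include <cstdint>
#include <vector>

// Serialize a minimal StartSound tag: RECORDHEADER with tag code 15,
// then SoundId, then a SOUNDINFO whose flag byte is all zeros (no in/out
// points, no loops, no envelope).
std::vector<uint8_t> makeStartSound(uint16_t soundId) {
    std::vector<uint8_t> out;
    const uint16_t tagCode = 15;
    const uint16_t length  = 3;                 // SoundId (2) + SOUNDINFO flags (1)
    uint16_t header = (tagCode << 6) | length;  // short-form RECORDHEADER
    out.push_back(header & 0xFF);               // SWF integers are little-endian
    out.push_back(header >> 8);
    out.push_back(soundId & 0xFF);
    out.push_back(soundId >> 8);
    out.push_back(0x00);                        // SOUNDINFO: all flags clear
    return out;
}
```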
ADPCM (Adaptive Differential Pulse Code Modulation) is a family of audio compression and decompression algorithms. It is a simple but efficient compression scheme that avoids any licensing or patent issues that arise with more sophisticated sound compression schemes, and helps to keep player implementations small.
ADPCM uses a modified Differential Pulse Code Modulation (DPCM) sampling technique in which each sample is encoded by calculating a 'difference' (DPCM) value and applying to it a formula that includes the previous quantization value. The result is a compressed code from which nearly the same subjective audio quality can be recreated.
A common implementation takes 16-bit linear PCM samples and converts them to 4-bit codes, yielding a compression rate of 4:1. Public domain C code written by Jack Jansen is available at ftp://ftp.cwi.nl/pub/audio/adpcm.zip.
SWF extends Jansen’s implementation to support 2, 3, 4 and 5 bit ADPCM codes. When choosing a code size, there is the usual tradeoff between file-size and audio quality.
C++ source code to encode ADPCM sound data is available in the Flash File Format SDK.
(See class FDTDefineSoundADPCM in the Flash File Format SDK)
The ADPCMSOUNDDATA record defines the size of the ADPCM codes used, and an array of ADPCMPACKETs that contain the ADPCM data.
ADPCMSOUNDDATA

| Field | Type | Comment |
|---|---|---|
| AdpcmCodeSize | UB[2] 0 = 2 bits/sample, 1 = 3 bits/sample, 2 = 4 bits/sample, 3 = 5 bits/sample | Bits per ADPCM code, less 2; the actual size of each code is AdpcmCodeSize + 2 |
| AdpcmPackets | ADPCMPACKET[one or more] | Array of ADPCMPACKETs |
ADPCMPACKETs vary in structure depending on whether the sound is mono or stereo, 8-bit or 16-bit. A stereo sound with 16-bit samples is arranged as follows:
ADPCMPACKET16STEREO

| Field | Type | Comment |
|---|---|---|
| InitialSampleLeft | UI16 | First sample for the left channel; identical to the first sample in the uncompressed sound |
| InitialIndexLeft | UB[6] | Initial index into the ADPCM StepSizeTable* for the left channel |
| InitialSampleRight | UI16 | First sample for the right channel; identical to the first sample in the uncompressed sound |
| InitialIndexRight | UB[6] | Initial index into the ADPCM StepSizeTable* for the right channel |
| AdpcmCodeData | UB[8192 * (AdpcmCodeSize + 2)] | 4096 ADPCM codes per channel (8192 total); each code is (AdpcmCodeSize + 2) bits; channel data is interleaved left, then right |
* Refer to the ADPCM source code for an explanation of StepSizeTable.
If nBits = AdpcmCodeSize + 2, then AdpcmCodeData is arranged as follows:

ADPCMCODEDATA

| Field | Type | Comment |
|---|---|---|
| LeftAdpcmCode1 | UB[nBits] | Corresponds to the 2nd sample in the uncompressed sound |
| RightAdpcmCode1 | UB[nBits] | Corresponds to the 2nd sample in the uncompressed sound |
| LeftAdpcmCode2 | UB[nBits] | Corresponds to the 3rd sample in the uncompressed sound |
| RightAdpcmCode2 | UB[nBits] | Corresponds to the 3rd sample in the uncompressed sound |
| … | … | … |
| LeftAdpcmCode4096 | UB[nBits] | Corresponds to the 4096th sample in the uncompressed sound |
| RightAdpcmCode4096 | UB[nBits] | Corresponds to the 4096th sample in the uncompressed sound |
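For reference, the 4-bit decode step at the heart of Jansen's code looks like the sketch below; SWF's 2-, 3- and 5-bit variants follow the same pattern with different difference and index tables. This is the standard IMA algorithm, not the SDK's exact implementation:

```cpp
#include <cstdint>
#include <algorithm>

// Standard IMA step-size table (89 entries), as used in Jansen's reference code.
static const int kStepSizeTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143, 157, 173, 190, 209, 230,
    253, 279, 307, 337, 371, 408, 449, 494, 544, 598, 658, 724, 796, 876, 963,
    1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024, 3327,
    3660, 4026, 4428, 4871, 5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442,
    11487, 12635, 13899, 15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767
};

// Index adjustment for 4-bit codes (top bit is the sign).
static const int kIndexTable4[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8, -1, -1, -1, -1, 2, 4, 6, 8
};

// Decode one 4-bit ADPCM code into the next 16-bit sample. `sample` and
// `index` carry the decoder state between calls; they are seeded from the
// InitialSample / InitialIndex fields of the ADPCM packet.
int16_t decodeStep4(uint8_t code, int& sample, int& index) {
    int step = kStepSizeTable[index];
    int diff = step >> 3;                        // rounding term
    if (code & 4) diff += step;
    if (code & 2) diff += step >> 1;
    if (code & 1) diff += step >> 2;
    if (code & 8) sample -= diff; else sample += diff;
    sample = std::max(-32768, std::min(32767, sample));
    index += kIndexTable4[code & 0x0F];
    index = std::max(0, std::min(88, index));
    return (int16_t)sample;
}
```

The InitialIndex fields exist precisely so a packet can seed `index` without decoding everything that came before it.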
MP3 is a sophisticated and complex audio compression algorithm. It produces superior audio quality at better compression ratios than ADPCM. Generally speaking, "MP3" refers to MPEG1 Layer 3; however, SWF also supports the later MPEG versions (V2 and V2.5) that were designed to support lower bitrates.
A complete description of MP3 compression is beyond the scope of this document. For more information on MP3 see: http://www.mp3tech.org and http://www.iis.fhg.de/amm/techinf/layer3/index.html
The MP3SOUNDDATA record contains a DelaySeek value, which tells the player when to play the sounds represented by the MPEG audio frames that follow.
MP3SOUNDDATA

| Field | Type | Comment |
|---|---|---|
| DelaySeek* | UI16 | Number of samples to seek forward or delay |
| Mp3Frames | MP3FRAME[zero or more] | Array of MP3 frames |
* The DelaySeek field has the following meaning:
· If this value is positive, the player seeks this number of samples into the sound block before the sound is played. However, the seek is performed only if the player reached the current frame via a GotoFrame action; otherwise no seek is performed.
· If this value is negative, the player plays this number of silent samples before playing the sound block. The player behaves this way regardless of how the current frame was reached.
DelaySeek is necessary because an MP3 stream does not divide into an equal number of samples per frame, so the player must calculate which sounds belong in which frame when a GotoFrame action is performed. The Flash application stops exporting silent data after 0.5 seconds of silence, so the player needs to play some silence before sound starts coming in at a later frame; this is the silence that would have been there had the Flash application been exporting silence all along. (Note: only 0.5 seconds of silence is exported for MP3 compression.)
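The two cases above can be sketched as follows. Note one assumption: the table declares DelaySeek as UI16, but since the text allows negative values, it is treated here as a signed 16-bit quantity; the function and parameter names are illustrative, not from the player:

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>
#include <algorithm>

// Apply the DelaySeek value of an MP3SOUNDDATA record to a block of decoded
// samples. `arrivedViaGoto` reflects whether the player reached the current
// frame through a GotoFrame action.
std::vector<int16_t> applyDelaySeek(const std::vector<int16_t>& decoded,
                                    int16_t delaySeek, bool arrivedViaGoto) {
    if (delaySeek > 0) {
        // Seek forward into the block, but only after a GotoFrame action.
        if (!arrivedViaGoto) return decoded;
        size_t skip = std::min(decoded.size(), (size_t)delaySeek);
        return std::vector<int16_t>(decoded.begin() + skip, decoded.end());
    }
    if (delaySeek < 0) {
        // Prepend silence regardless of how the frame was reached.
        std::vector<int16_t> out((size_t)(-delaySeek), 0);
        out.insert(out.end(), decoded.begin(), decoded.end());
        return out;
    }
    return decoded;
}
```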
The MP3FRAME record corresponds exactly to an MPEG audio frame that you would find in an MP3 music file. The first 32-bits of the frame contain header information, followed by an array of bytes which are the encoded audio samples. For more information on MPEG audio frames see: http://mp3tech.free.fr/programmers/frame_header.html
MP3FRAME

| Field | Type | Comment |
|---|---|---|
| FrameSync | UB[11] | Frame sync; all bits must be set |
| MpegVersion | UB[2] 0 = MPEG Version 2.5, 1 = reserved, 2 = MPEG Version 2, 3 = MPEG Version 1 | MPEG 2.5 is an extension to MPEG 2 that handles very low bitrates, allowing the use of lower sampling frequencies |
| Layer | UB[2] 0 = reserved, 1 = Layer 3, 2 = Layer 2, 3 = Layer 1 | Layer is always equal to 1 for MP3 headers in SWF files; the "3" in MP3 refers to the Layer, not the MpegVersion |
| ProtectionBit | UB[1] 0 = protected by CRC, 1 = not protected | If ProtectionBit is 0, a 16-bit CRC follows the header |
| Bitrate | UB[4] | Bitrate index; see the bitrate table below. Bitrates are in thousands of bits per second; for example, 128 means 128000 bps |
| SamplingRate | UB[2] | Sampling rate index; see the sampling rate table below. Values are in Hz |
| PaddingBit | UB[1] 0 = frame is not padded, 1 = frame is padded | Padding is used to exactly fit the bitrate |
| Reserved | UB[1] | |
| ChannelMode | UB[2] 0 = stereo, 1 = joint stereo, 2 = dual channel, 3 = mono | Dual-channel files are made of two independent mono channels; each uses exactly half the bitrate of the file |
| ModeExtension | UB[2] | Used only in joint stereo mode |
| Copyright | UB[1] 0 = audio is not copyrighted | |
| Original | UB[1] 0 = copy of original media | |
| Emphasis | UB[2] 0 = none | |
| SampleData | UB[size of sample data*] | The encoded audio samples |

Bitrate index values (kbps):

| Value | MPEG1 | MPEG2.x |
|---|---|---|
| 0 | free | free |
| 1 | 32 | 8 |
| 2 | 40 | 16 |
| 3 | 48 | 24 |
| 4 | 56 | 32 |
| 5 | 64 | 40 |
| 6 | 80 | 48 |
| 7 | 96 | 56 |
| 8 | 112 | 64 |
| 9 | 128 | 80 |
| 10 | 160 | 96 |
| 11 | 192 | 112 |
| 12 | 224 | 128 |
| 13 | 256 | 144 |
| 14 | 320 | 160 |
| 15 | bad | bad |

Sampling rate index values (Hz):

| Value | MPEG1 | MPEG2 | MPEG2.5 |
|---|---|---|---|
| 0 | 44100 | 22050 | 11025 |
| 1 | 48000 | 24000 | 12000 |
| 2 | 32000 | 16000 | 8000 |
| 3 | reserved | reserved | reserved |
* The size of the sample data is calculated like this (using integer arithmetic):

Size = (((MpegVersion == MPEG1 ? 144 : 72) * Bitrate) / SamplingRate) + PaddingBit - 4

For example, the size of the sample data for an MPEG1 frame with a Bitrate of 128000, a SamplingRate of 44100, and a PaddingBit of 1 is:

Size = (144 * 128000) / 44100 + 1 - 4 = 414 bytes
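The calculation above can be expressed directly in code; the function name is illustrative:

```cpp
// Payload size in bytes of one MP3 frame in SWF, per the formula above.
// The 4 subtracted bytes are the frame header itself, so the result is the
// size of SampleData. Integer arithmetic throughout.
int mp3SampleDataSize(bool isMpeg1, int bitrateBps, int samplingRateHz, int paddingBit) {
    int coeff = isMpeg1 ? 144 : 72;  // 72 covers MPEG2 and MPEG2.5
    return (coeff * bitrateBps) / samplingRateHz + paddingBit - 4;
}
```

With the worked example's inputs (MPEG1, 128000 bps, 44100 Hz, padded) this returns 414.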
SWF supports a streaming sound mode in which sound data is downloaded and played in tight synchronization with the timeline. In this mode, sound packets are stored with each frame. Normally, if a CPU cannot render the frames of a movie as quickly as the movie's frame rate specifies, the player slows down the animation rate rather than skipping frames. When streaming sound is present, however, the player skips frames in order to maintain tight synchronization with the sound track.
Note: A timeline can only have a single streaming sound playing at a time, but each movie clip (or sprite) can have its own streaming sound.
If a timeline contains streaming sound data, there must be a SoundStreamHead (or SoundStreamHead2) tag before the first sound data block (see SoundStreamBlock). The SoundStreamHead tag defines the data format of the sound data, the recommended playback format, and the average number of samples per SoundStreamBlock.
(See class FDTSoundStreamHead in the Flash File Format SDK)
SoundStreamHead

| Field | Type | Comment |
|---|---|---|
| Header | RECORDHEADER | Tag ID = 18 |
| Reserved | UB[4] = 0 | Always zero |
| PlaybackSoundRate | UB[2] = 0 | Playback sampling rate |
| PlaybackSoundSize | UB[1] = 1, 0 = 8-bit, 1 = 16-bit | Playback sample size |
| PlaybackSoundType | UB[1] | Number of playback channels: mono or stereo |
| StreamSoundCompression | UB[4] = 1, 0 = uncompressed, 1 = ADPCM, 2 = MP3 | Always 1 (ADPCM) |
| StreamSoundRate | UB[2] = 0 | The sampling rate of the streaming sound data (always zero?) |
| StreamSoundSize | UB[1] = 1, 0 = 8-bit, 1 = 16-bit | The sample size of the streaming sound data; always 16-bit |
| StreamSoundType | UB[1] | Number of channels in the streaming sound data |
| StreamSoundSampleCount | UI16 | Average number of samples in each SoundStreamBlock |
Notes:
· The PlaybackSound* fields define how the player should play (mix) the streaming sound.
· The StreamSound* fields define the structure of the sound data in the SoundStreamBlocks.
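As a rule of thumb, StreamSoundSampleCount is the stream's sampling rate divided by the movie's frame rate, since each SoundStreamBlock carries roughly one frame's worth of sound. A small illustrative sketch, not taken from the specification:

```cpp
// Average number of stream samples that must accompany each movie frame,
// e.g. a candidate value for StreamSoundSampleCount: rate / frame rate,
// rounded to the nearest whole sample.
int averageSamplesPerBlock(int streamSoundRateHz, double movieFrameRate) {
    return (int)(streamSoundRateHz / movieFrameRate + 0.5);
}
```

For a 44100 Hz stream in a 12 fps movie this gives 3675 samples per block.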
The SoundStreamHead2 tag is identical to the SoundStreamHead tag, apart from allowing different values for StreamSoundCompression and StreamSoundSize.
(See class FDTSoundStreamHead2 in the Flash File Format SDK)
SoundStreamHead2

| Field | Type | Comment |
|---|---|---|
| Header | RECORDHEADER | Tag ID = 45 |
| Reserved | UB[4] = 0 | Always zero |
| PlaybackSoundRate | UB[2] | Playback sampling rate |
| PlaybackSoundSize | UB[1] 0 = 8-bit, 1 = 16-bit | Playback sample size |
| PlaybackSoundType | UB[1] | Number of playback channels: mono or stereo |
| StreamSoundCompression | UB[4] 0 = uncompressed, 1 = ADPCM, 2 = MP3 | |
| StreamSoundRate | UB[2] | The sampling rate of the streaming sound data |
| StreamSoundSize | UB[1] 0 = 8-bit, 1 = 16-bit | The sample size of the streaming sound data; if StreamSoundCompression = 0, 8- or 16-bit, otherwise always 16-bit |
| StreamSoundType | UB[1] | Number of channels in the streaming sound data |
| StreamSoundSampleCount | UI16 | Average number of samples in each SoundStreamBlock |
The SoundStreamBlock tag defines sound data that is interleaved with frame data so that sounds can be played as the movie is streamed over a network connection. The SoundStreamBlock tag must be preceded by a SoundStreamHead (or SoundStreamHead2) tag.
(See class FDTSoundStreamBlock in the Flash File Format SDK)
SoundStreamBlock

| Field | Type | Comment |
|---|---|---|
| Header | RECORDHEADER | Tag ID = 19 |
| StreamSoundData* | UI8[size of compressed data] | Compressed sound data |
The contents of StreamSoundData vary depending on the value of the StreamSoundCompression field in the SoundStreamHead tag:
· If StreamSoundCompression is 0, StreamSoundData contains raw, uncompressed samples.
· If StreamSoundCompression is 1, StreamSoundData contains an ADPCMSOUNDDATA record.
· If StreamSoundCompression is 2, StreamSoundData contains an MP3STREAMSOUNDDATA record.
MP3STREAMSOUNDDATA

| Field | Type | Comment |
|---|---|---|
| SampleCount | UI16 | Number of samples represented by this block |
| Mp3SoundData | MP3SOUNDDATA | MP3 frames, with the DelaySeek value |
The SDK contains C++ examples that demonstrate how to create sounds in SWF:
· FExampleSound.cpp uses the low-level manager to read a .wav file and write it to the SWF file as an ADPCM-compressed sound. The class FDTDefineSoundWav reads the .wav file and compresses it using 2, 3, 4 or 5-bit ADPCM compression.
· HFExampleSound.cpp uses the high-level manager to create two sounds. The first is created from a .wav file. The second is created from an array of sample data using the HFSound class. Both sounds are compressed with ADPCM compression.