Annotated edit history of soxexam(1) version 1, including all changes.
Rev Author # Line
1 perry 1 SoX
2 !!!SoX
3 NAME
4 CONVERSIONS
5 EFFECTS
6 SEE ALSO
7 AUTHOR
8 ----
9 !!NAME
10
11
12 soxexam - SoX Examples (CHEAT SHEET)
13 !!CONVERSIONS
14
15
16 __Introduction__
17
18
19 In general, SoX will attempt to take an input sound file
20 format and convert it into a new file format using a similar
21 data type and sample rate. For instance,
22
23
24 If an output format doesn't support the same data type as
25 the input file then SoX will generally select a default data
26 type to save it in. You can override the default data type
27 selection by using command line options. This is also useful
28 for producing an output file with higher or lower precision
29 data and/or sample rate.
30
31
32 Most file formats that contain headers can automatically be
33 read in. When working with header-less file formats, the
34 user must manually tell SoX the data type and sample rate
35 using command line options.
36
37
38 When working with header-less files (raw files), you may
39 take advantage of the pseudo-file types of .ub, .uw, .sb,
40 .sw, .ul, and .sl. By using these extensions on your
41 filenames you will not have to specify the corresponding
42 options on the command line.
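
The shorthand above can be sketched as a lookup from extension to the options it stands in for. This is only an illustration: the function name raw_flags_for_ext is made up, and the mapping assumes the option letters used elsewhere on this page (-u/-s/-U for the data type, -b/-w/-l for byte, word and long).

```shell
# Hypothetical helper, not part of SoX: print the data-type options
# that a raw pseudo-file extension is assumed to replace.
raw_flags_for_ext() {
    case "${1##*.}" in
        ub) echo "-u -b" ;;   # unsigned byte
        uw) echo "-u -w" ;;   # unsigned word
        sb) echo "-s -b" ;;   # signed byte
        sw) echo "-s -w" ;;   # signed word
        ul) echo "-U -b" ;;   # u-law (byte)
        sl) echo "-s -l" ;;   # signed long
        *)  return 1 ;;       # not one of the listed pseudo-file types
    esac
}

raw_flags_for_ext filename.ub
```

The sample rate is not encoded in the extension, so an option such as -r is still needed for raw input.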
43
44
45 __Precision__
46
47
48 The following data types and formats can be represented by
49 their total uncompressed bit precision. When converting from
50 one data type to another, care must be taken to ensure the
51 new type has an equal or greater precision. If not, the audio
52 quality will be degraded. This is not always a bad thing when
53 you're working with things such as voice audio and are
54 concerned about disk space or bandwidth of the audio
55 data.
56
57
58 Data Format Precision
59 ___________ _________
60 unsigned byte 8-bit
61 signed byte 8-bit
62 u-law 14-bit
63 A-law 13-bit
64 unsigned word 16-bit
65 signed word 16-bit
66 ADPCM 16-bit
67 GSM 16-bit
68 unsigned long 32-bit
69 signed long 32-bit
70 ___________ _________
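
As a rough sketch of the rule stated before the table, the comparison can be written as a small shell check. The function names and the format labels are invented for this example; the bit counts are copied from the table.

```shell
# Hypothetical helpers, not part of SoX. The format names are labels made
# up for this sketch; the bit counts are copied from the table above.
precision_bits() {
    case "$1" in
        unsigned-byte|signed-byte) echo 8 ;;
        u-law) echo 14 ;;
        A-law) echo 13 ;;
        unsigned-word|signed-word|ADPCM|GSM) echo 16 ;;
        unsigned-long|signed-long) echo 32 ;;
        *) return 1 ;;
    esac
}

# Print "ok" when the target keeps at least the source precision,
# "degrades" otherwise.
conversion_check() {
    from=$(precision_bits "$1") || return 1
    to=$(precision_bits "$2") || return 1
    if [ "$to" -ge "$from" ]; then echo ok; else echo degrades; fi
}

conversion_check signed-word u-law   # prints "degrades"
conversion_check A-law signed-word   # prints "ok"
```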
71
72
73 __Examples__
74
75
76 Use the '-V' option on all your command lines. It makes SoX
77 print out its idea of what is going on. '-V' is your
78 friend.
79
80
81 To convert from unsigned bytes at 8000 Hz to signed words at
82 8000 Hz:
83
84
85 sox -r 8000 -c 1 filename.ub newfile.sw
86
87
88 To convert from Apple's AIFF format to Microsoft's WAV
89 format:
90
91
92 sox filename.aiff filename.wav
93
94
95 To convert from mono raw 8000 Hz 8-bit unsigned PCM data to
96 a WAV file:
97
98
99 sox -r 8000 -u -b -c 1 filename.raw
100 filename.wav
101
102
103 SoX may even be used to convert sample rates. Downconverting
104 will reduce the bandwidth of a sample, but will reduce
105 storage space on your disk. All such conversions are lossy
106 and will introduce some noise. You should really pass your
107 sample through a low pass filter prior to downconverting as
108 this will prevent alias signals (which would sound like
109 additional noise). For example, to convert from a sample
110 recorded at 11025 Hz to a u-law file at 8000 Hz sample
111 rate:
112
113
114 sox infile.wav -t au -r 8000 -U -b -c 1
115 outputfile.au
116
117
118 To add a low-pass filter (note use of stdout for output of
119 the first stage and stdin for input on the second
120 stage):
121
122
123 sox infile.wav -t raw -s -w -c 1 - lowpass 3700 | sox -t raw
124 -r 11025 -s -w -c 1 - -t au -r 8000 -U -b -c 1
125 ofile.au
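
The text does not say why 3700 Hz was picked. One plausible reading, offered here as an assumption, is that the cutoff has to sit below half the output sample rate (the Nyquist frequency), which is 4000 Hz for an 8000 Hz output. The hypothetical helper below only does that arithmetic and does not call SoX.

```shell
# Hypothetical helper, not part of SoX: print the Nyquist frequency
# (half the sample rate) for a sample rate given in Hz.
nyquist_hz() {
    echo $(( $1 / 2 ))
}

# A low-pass cutoff for an 8000 Hz output should stay below this value.
nyquist_hz 8000
```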
126
127
128 If you hear some clicks and pops when converting to u-law or
129 A-law, reduce the output level slightly, for example this
130 will decrease it by 20%:
131
132
133 sox infile.wav -t au -r 8000 -U -b -c 1 -v .8
134 outputfile.au
135
136
137 ''SoX'' is great to use along with other command line
138 programs by passing data between the programs using
139 pipelines. The most common example is to use mpg123 to
140 convert mp3 files into wav files. The following command
141 line will do this:
142
143
144 mpg123 -b 10000 -s filename.mp3 | sox -t raw -r 44100 -s -w
145 -c 2 - filename.wav
146
147
148 When working with totally unknown audio data then the
149
150
151 sox -V -t auto filename.snd filename.wav
152
153
154 It is important to understand how the internals of
155 ''SoX'' work with compressed audio including u-law,
156 A-law, ADPCM, or GSM. ''SoX'' takes ALL input data types
157 and converts them to uncompressed 32-bit signed data. It
158 will then convert this internal version into the requested
159 output format. This means additional noise can be introduced
160 from decompressing data and then recompressing. If applying
161 multiple effects to audio data, it is best to save the
162 intermediate data as PCM data. After the final effect is
163 performed, then you can specify it as a compressed output
164 format. This will keep noise introduction to a
165 minimum.
166
167
168 The following example applies various effects to an 8000 Hz
169 ADPCM input file and then ends up with the final file as
170 44100 Hz ADPCM.
171
172
173 sox firstfile.wav -r 44100 -s -w secondfile.wav
174 sox secondfile.wav thirdfile.wav swap
175 sox thirdfile.wav -a -b finalfile.wav mask
176
177
178 Under a DOS shell, you can convert several audio files to a
179 new output format using something similar to the following
180 command line:
181
182
183 FOR %X IN (*.RAW) DO sox -r 11025 -w -s -t raw %X
184 %X.wav
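
On a Unix-like shell, the same batch conversion can be sketched with a for loop. The function name convert_raw_files is made up for this example, and it only prints the commands (a dry run), so nothing is converted until you are happy with the output.

```shell
# Dry-run sketch, not part of SoX: print one sox command per file given
# on the command line, using the same options as the DOS example.
convert_raw_files() {
    for f in "$@"; do
        printf 'sox -r 11025 -w -s -t raw %s %s.wav\n' "$f" "$f"
    done
}

convert_raw_files first.raw second.raw
```

As in the DOS example, the output name is the input name with .wav appended. The printed lines are not quoted, so this sketch assumes file names without spaces.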
185 !!EFFECTS
186
187
188 Special thanks go to Juergen Mueller
189 (jmueller@uia.ua.ac.be) for this write-up on
190 effects.
191
192
193 __Introduction:__
194
195
196 The core problem is that you need some experience in using
197 effects in order to say
198
199
200 Here are some examples which can be used with any music
201 sample. (For a sample where only a single instrument is
202 playing, extreme parameter setting may make well-known
203
204
205 Single effects will be explained, along with some parameter
206 settings that can be used to understand the theory by
207 listening to the sound file with the added
208 effect.
209
210
211 Using multiple effects in parallel or in series can result
212 either in a very nice sound or (mostly) in a dramatic
213 overloading in variations of sounds such that your ear may
214 follow the sound but you will feel unsatisfied. Hence, when
215 using effects for the first time, try to combine as few of
216 them as possible. We don't regard the composition of
217 effects in the examples because too many combinations are
218 possible and you really need a very fast machine and a lot
219 of memory to play them in real-time.
220
221
222 However, real-time playing of sounds will greatly speed up
223 learning and/or tuning the parameter settings for your
224 sounds in order to get that
225
226
227 Basically, we will use the
228
229
230 For easy listening of file.xxx (
231
232
233 play file.xxx effect-name effect-parameters
234
235
236 Or more SoX-like (for
237
238
239 sox file.xxx -t ossdsp -w -s /dev/dsp effect-name
240 effect-parameters
241
242
243 or (for
244
245
246 sox file.xxx -t sunau -w -s /dev/audio effect-name
247 effect-parameters
248
249
250 And for data freaks:
251
252
253 sox file.xxx file.yyy effect-name
254 effect-parameters
255
256
257 Additional options can be used. However, in this case, for
258 real-time playing you'll need a very fast
259 machine.
260
261
262 Notes:
263
264
265 I played all examples in real-time on a Pentium 100 with 32
266 MB and Linux 2.0.30 using a self-recorded sample ( 3:15 min
267 long in
268
269
270 Effects:
271
272
273 __Echo__
274
275
276 An echo effect occurs naturally in the mountains: standing
277 somewhere on a mountain and shouting a single word will
278 result in one or more repetitions of the word (if not,
279 turn a bit around and try again, or climb to the next
280 mountain).
281
282
283 The time difference between the shout and its repetition
284 is the delay (time); the loudness of the repetition is the
285 decay. Multiple echos can have different delays and decays.
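
To get a feel for the delay values used in the examples that follow, it can help to convert them to a number of samples. The sketch below assumes the delay is given in milliseconds and that the sample rate is known; the function name delay_in_samples is made up and the helper does not call SoX.

```shell
# Hypothetical helper, not part of SoX: convert a delay in milliseconds
# to a (rounded) number of samples at the given sample rate in Hz.
delay_in_samples() {
    awk -v rate="$1" -v ms="$2" 'BEGIN { printf "%.0f\n", rate * ms / 1000 }'
}

# A 60 ms delay at an assumed sample rate of 44100 Hz.
delay_in_samples 44100 60.0
```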
286
287
288 It is very popular to use echos to play an instrument
289 together with itself, as some guitar players (Brian May from
290 Queen) or vocalists do. For music samples of more
291 than one instrument, echo can be used to add a second sample
292 shortly after the original one.
293
294
295 This will sound as if you are doubling the number of
296 instruments playing in the same sample:
297
298
299 play file.xxx echo 0.8 0.88 60.0 0.4
300
301
302 If the delay is very short, then it sounds like a (metallic)
303 robot playing music:
304
305
306 play file.xxx echo 0.8 0.88 6.0 0.4
307
308
309 A longer delay will sound like an open air concert in the
310 mountains:
311
312
313 play file.xxx echo 0.8 0.9 1000.0 0.3
314
315
316 One mountain more, and:
317
318
319 play file.xxx echo 0.8 0.9 1000.0 0.3 1800.0
320 0.25
321
322
323 __Echos__
324
325
326 Like the echo effect, echos stand for
327
328
329 The sample will be bounced twice in symmetric
330 echos:
331
332
333 play file.xxx echos 0.8 0.7 700.0 0.25 700.0
334 0.3
335
336
337 The sample will be bounced twice in asymmetric
338 echos:
339
340
341 play file.xxx echos 0.8 0.7 700.0 0.25 900.0
342 0.3
343
344
345 The sample will sound as if played in a garage:
346
347
348 play file.xxx echos 0.8 0.7 40.0 0.25 63.0 0.3
349
350
351 __Chorus__
352
353
354 The chorus effect has its name because it will often be used
355 to make a single vocal sound like a chorus. But it can be
356 applied to other instrument samples too.
357
358
359 It works like the echo effect with a short delay, but the
360 delay isn't constant. The delay is varied using a sinusoidal
361 or triangular modulation. The modulation depth defines the
362 range over which the modulated delay moves before or after
363 the nominal delay. Hence the delayed sound will sound slower
364 or faster, that is, the delayed sound is tuned around the
365 original one, like in a chorus where some vocals are a bit
366 out of tune.
367
368
369 The typical delay is around 40ms to 60ms, the speed of the
370 modulation is best near 0.25Hz and the modulation depth
371 around 2ms.
372
373
374 A single delay will make the sample more
375 overloaded:
376
377
378 play file.xxx chorus 0.7 0.9 55.0 0.4 0.25 2.0
379 -t
380
381
382 Two delays of the original samples sound like
383 this:
384
385
386 play file.xxx chorus 0.6 0.9 50.0 0.4 0.25 2.0 -t 60.0 0.32
387 0.4 1.3 -s
388
389
390 A big chorus of the sample is (three additional
391 samples):
392
393
394 play file.xxx chorus 0.5 0.9 50.0 0.4 0.25 2.0 -t 60.0 0.32
395 0.4 2.3 -t 40.0 0.3 0.3 1.3 -s
396
397
398 __Flanger__
399
400
401 The flanger effect is like the chorus effect, but the delay
402 varies between 0ms and a maximum of 5ms. It sounds like wind
403 blowing, sometimes faster or slower including changes of the
404 speed.
405
406
407 The flanger effect is widely used in funk and soul music,
408 where the guitar sound frequently varies, slowly or a bit
409 faster.
410
411
412 The typical delay is around 3ms to 5ms, the speed of the
413 modulation is best near 0.5Hz.
414
415
416 Now, let's groove the sample:
417
418
419 play file.xxx flanger 0.6 0.87 3.0 0.9 0.5 -s
420
421
422 Listen carefully to the difference between sinusoidal and
423 triangular modulation:
424
425
426 play file.xxx flanger 0.6 0.87 3.0 0.9 0.5 -t
427
428
429 If the decay is a bit lower, then the effect sounds more
430 popular:
431
432
433 play file.xxx flanger 0.8 0.88 3.0 0.4 0.5 -t
434
435
436 The drunken loudspeaker system:
437
438
439 play file.xxx flanger 0.9 0.9 4.0 0.23 1.3 -s
440
441
442 __Reverb__
443
444
445 The reverb effect is often used in audience halls which are
446 too small or contain too many visitors, which disturb
447 (dampen) the reflection of sound at the walls. Reverb will
448 make the sound be perceived as if it were in a large hall.
449 You can try the reverb effect in your bathroom or garage or
450 a sports hall by loudly shouting some words. You'll hear the
451 words reflected from the walls.
452
453
454 The biggest problem in using the reverb effect is the
455 correct setting of the (wall) delays such that the sound is
456 realistic and doesn't sound like music playing in a tin can
457 or has overloaded feedback which destroys any illusion of
458 playing in a big hall. To help you obtain realistic reverb
459 effects, you should decide first how long the reverb should
460 take place until it is not loud enough to be registered by
461 your ears. This is done by varying the reverb time
462
463
464 Since audience halls do have a lot of walls, we will start
465 designing one beginning with one wall:
466
467
468 play file.xxx reverb 1.0 600.0 180.0
469
470
471 One wall more:
472
473
474 play file.xxx reverb 1.0 600.0 180.0 200.0
475
476
477 Next two walls:
478
479
480 play file.xxx reverb 1.0 600.0 180.0 200.0 220.0
481 240.0
482
483
484 Now, why not a futuristic hall with six walls:
485
486
487 play file.xxx reverb 1.0 600.0 180.0 200.0 220.0 240.0 280.0
488 300.0
489
490
491 If you run out of machine power or memory, then stop as many
492 applications as possible (every interrupt will consume a lot
493 of CPU time which for bigger halls is absolutely
494 necessary).
495
496
497 __Phaser__
498
499
500 The phaser effect is like the flanger effect, but it uses a
501 reverb instead of an echo and does phase shifting. You'll
502 hear the difference in the examples comparing both effects
503 (simply change the effect name). The delay modulation can be
504 sinusoidal or triangular; the latter is preferable for
505 multiple instruments. For single instrument sounds, the
506 sinusoidal phaser effect will give a sharper phasing effect.
507 The decay shouldn't be too close to 1.0, which will cause
508 dramatic feedback. A good range is about 0.5 to 0.1 for the
509 decay.
510
511
512 We will take a parameter setting as for the flanger before
513 (gain-out is lower since feedback can raise the output
514 dramatically):
515
516
517 play file.xxx phaser 0.8 0.74 3.0 0.4 0.5 -t
518
519
520 The drunken loudspeaker system (now less
521 alcohol):
522
523
524 play file.xxx phaser 0.9 0.85 4.0 0.23 1.3 -s
525
526
527 A popular sound of the sample is as follows:
528
529
530 play file.xxx phaser 0.89 0.85 1.0 0.24 2.0 -t
531
532
533 The sample sounds as if ten springs are in your
534 ears:
535
536
537 play file.xxx phaser 0.6 0.66 3.0 0.6 2.0 -t
538
539
540 __Compander__
541
542
543 The compander effect allows the dynamic range of a signal to
544 be compressed or expanded. For most situations, the attack
545 time (response to the music getting louder) should be
546 shorter than the decay time because our ears are more
547 sensitive to suddenly loud music than to suddenly soft
548 music.
549
550
551 For example, suppose you are listening to Strauss'
552
553
554 play file.xxx compand 0.3,1 -90,-90,-70,-70,-60,-20,0,0 -5 0
555 0.2
556
557
558 The transfer function (
559 very'' soft sounds between -90 and -70 decibels (-90 is
560 about the limit of 16-bit encoding) will remain unchanged.
561 That keeps the compander from boosting the volume on
562 ''
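
The transfer function in the compand example above is a list of input/output pairs in decibels. As a rough illustration, the sketch below looks up the output level for a given input level, assuming straight-line interpolation in dB between neighbouring points; that interpolation rule and the function name compand_out_db are assumptions of this sketch, not a description of SoX's implementation.

```shell
# Hypothetical helper, not part of SoX: evaluate the transfer function
# -90,-90,-70,-70,-60,-20,0,0 at an input level given in dB, assuming
# linear interpolation in dB between neighbouring points.
compand_out_db() {
    awk -v x="$1" 'BEGIN {
        n = split("-90 -90 -70 -70 -60 -20 0 0", p, " ")
        for (i = 1; i + 3 <= n; i += 2) {
            x0 = p[i];     y0 = p[i + 1]
            x1 = p[i + 2]; y1 = p[i + 3]
            if (x >= x0 && x <= x1) {
                printf "%.1f\n", y0 + (x - x0) * (y1 - y0) / (x1 - x0)
                exit
            }
        }
        print "out of range"
    }'
}

compand_out_db -80   # inside the unchanged region between -90 and -70
compand_out_db -65   # halfway between the -70 and -60 points
```

Under this assumption, an input at -80 dB stays at -80 dB, while an input at -65 dB is raised to -45 dB.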
563
564
565 __Changing the Rate of Playback__
566
567
568 You can use stretch to change the rate of playback of an
569 audio sample while preserving the pitch. For example, to play
570 at 1/2 the speed:
571
572
573 play file.wav stretch 2
574
575
576 To play a file at twice the speed:
577
578
579 play file.wav stretch .5
580
581
582 Other related options are
583
584
585 play file.wav speed 2
586
587
588 To raise the pitch of a sample by one semitone (100
589 cents):
590
591
592 play file.wav pitch 100
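
For reference, a pitch shift in cents corresponds to a frequency ratio of 2^(cents/1200), so 1200 cents is one octave. The helper below is a made-up name for this sketch and only does the arithmetic; it does not call SoX.

```shell
# Hypothetical helper, not part of SoX: print the frequency ratio that
# corresponds to a pitch shift given in cents (1200 cents = one octave).
cents_to_ratio() {
    awk -v c="$1" 'BEGIN { printf "%.4f\n", exp(log(2) * c / 1200) }'
}

cents_to_ratio 100    # the shift used in the example above
cents_to_ratio 1200   # one octave
```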
593
594
595 __Other effects (copy, rate, avg, stat, vibro, lowp, highp,
596 band, reverb)__
597
598
599 The other effects are simple to use. However, an
600
601
602 __More effects (to do !)__
603
604
605 There are a lot of effects around like noise gates,
606 compressors, wah-wah, stereo effects and so on. They should
607 be implemented, making SoX more useful in sound mixing
608 techniques coming together with a great variety of different
609 sound effects.
610
611
612 Combining effects by using them in parallel or serially on
613 different channels needs some easy mechanism which is stable
614 for use in real-time.
615
616
617 Really missing are the changing of the parameters and
618 starting/stopping of effects while playing samples in
619 real-time!
620
621
622 Good luck and have fun with all the effects!
623
624
625 Juergen Mueller (jmueller@uia.ua.ac.be)
626 !!SEE ALSO
627
628
629 sox(1), play(1), rec(1)
630 !!AUTHOR
631
632
633 Juergen Mueller (jmueller@uia.ua.ac.be)
634
635
636 Updates by Anonymous.
637 ----
This page is a man page (or other imported legacy content). We are unable to automatically determine the license status of this page.