I have started a project to set up a consistent tool set and methodology to create organ samples. Here is a collection of tips (findings) I'd like to share (and have confirmed), along with some questions for whoever can answer them...
(Sorry for the length of the post: I'll try to improve/split it later)
(Parts of it probably need further clarification. My aim was not to make the “perfect rookie tutorial”. Please just ask questions...)
The scope, here, is:
- record the organ
- process the samples (the main job...)
- making a package (at least for testing purpose)
Out of scope: virtual consoles, sampler (runtime) software, making "extensions"/customizations...
I'll try to update this post with new findings, and provide the dedicated software tools as Python scripts.
B) Recording the organ (incl. Hardware)
C) Samples processing 1 : generic statements (incl. Software tools)
D) Samples processing 2 : noise reduction
E) Samples processing 3 : extracting notes from recordings
F) Samples processing 4 : setting loops and release points
G) Samples processing 5 : (re-)tuning, reverb
H) Creation of the package
The deliverable is the file set (samples, documentation and definition files) to be used with a "virtual organ software". Such packages can be purchased, but are usually VERY expensive, and there is an ethical (and maybe legal) problem letting commercial companies “own” sounds or getting profit from the work of past organ builders (this would be a separate topic, right?). The aim here is to help anyone who would like to record an organ to “save” its current sound performance, and to share it. As easily as possible.
The best would be to stay as generic as possible. However, the "convergence" platform seems to be the Myorgan/Ourgan/Grand_orgue/HW1 format. (Ahem... the name is "Grand_orgue" *this* week...)
-> If you do not know this software, please consult the other posts in this forum. A minimum knowledge of this software, and of what it does, is mandatory.
A word about the sampling frequency and sample size (bit depth); this part can be skipped:
[Finding] "Grand_orgue" (free/opensource) is today the best testing/sharing platform/format
[Finding] "Grand_orgue" only accepts 44.1kHz/16bits samples
[Question] is there a plan to have "Grand_orgue" running on 24bits samples? On 96kHz?
- Going higher in frequency is useless for the human ear, but leaves some “spare quality” for post-processing.
- 44.1kHz is OK.
- 96kHz costs 2.2 times more memory than 44.1kHz (the Compact Disc standard). Memory is expensive.
- Recording in 24 bits (3 bytes) gives a better definition than 16 bits (2 bytes).
- Using 24 bits resolution without a top-class audio system (amplifier/loudspeakers/headset) is a waste of resources. Using 24 bits on a PC-class audio system is nonsense.
- Using 24 bits costs 1.5 times more memory than 16 bits. 24bits/96kHz costs 3.3 times more memory than 16bits/44.1kHz.
- 16bits/44.1kHz is a quality standard way above any “mp3” encoding.
[Finding] On the computer, a 32-bit Windows operating system can only use about 3 GB of memory. A top-quality 30-40 stop organ in 24bits/96kHz will typically use 10 GB of RAM... way beyond the means (in 2010) of a non-professional. (8 GB is costly but can be done.)
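The memory ratios quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `memory_ratio` is made up for the illustration, and it assumes the same duration and channel count on both sides.

```python
def memory_ratio(rate_hz, bits, ref_rate_hz=44100, ref_bits=16):
    """Memory cost of a format relative to a reference format
    (same duration, same number of channels)."""
    return (rate_hz * bits) / (ref_rate_hz * ref_bits)

print(round(memory_ratio(96000, 16), 2))  # 96kHz vs 44.1kHz -> 2.18
print(round(memory_ratio(44100, 24), 2))  # 24 bits vs 16 bits -> 1.5
print(round(memory_ratio(96000, 24), 2))  # 24bits/96kHz vs 16bits/44.1kHz -> 3.27
```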
[Conclusion] We will design what follows to create a sample set ready to be used with "Grand_orgue". So, for the moment, we will limit to 44.1kHz/16bits.
B) Recording the organ
We are looking for :
- a process
- a microphone pair
- a recording device
- recording level specification
- a way to monitor
Target: Record all the notes of all the stops in the same conditions, with as little noise as possible, while trying to make the next steps easy.
[Note] This does not mean "recording all the pipes individually" (a 4 ranks Fourniture will only have 1 sample per note, not 4).
[Finding] Each note recording (aka “sample”) shall have a 3-5s sustain. (This is needed to detect loops: <3s is not enough, >5s leads to useless resource consumption (RAM, disk space, CPU load).) Let's go for 4s. Bass pipes may need more, especially reeds.
[Finding] After the release, the recording of the note shall go on until the reverberation has completely died away. This depends on the building. 4s should be a minimum. More would be needed in large buildings.
[Finding] During the recording, it would be very cumbersome to create 1 file per note. One file per stop (including all the notes, well separated) is more convenient.
[Finding] Obviously, don't use "mp3" recording (or anything “compressed”). Output shall be uncompressed at this level, aka “wav” file format.
[Finding] Reverberation can be added later. But not easily removed. Recording as “dry” as possible is the best.
[Finding] Swell stops shall be recorded with the expression open.
[Finding] Recording in chromatic order (C, C#, D, D#, etc...) is very difficult to handle later. And if the temperature changes, the frequency (especially for the flue pipes) will change significantly. If the temperature drifts during a chromatic recording, the basses end up poorly tuned against the trebles.
The proposal is to record all 5 Cs, then the C#s, then the Ds, etc... up to the 4 Bs (C,c0,c1,c2,c3,C#,c#0,...b1,b2).
[Finding] Very important! Record "silence" (i.e. background noise=mainly blower noise), before and after the stop session. (Efficient noise reduction algorithms all need the "signature" of the background noise.) Obviously, leave the blower on when recording “silence”. (And, if Glenn Gould was playing, just tell him to keep humming.)
[Conclusion] So, we will record 1 wav file for each stop. Hold each note for 4s. Space the notes by at least 4s. A typical 54-note stop will be 8-9 min long. The file will be 80 to 100 MB (44.1kHz * 2 channels * 2 bytes * 60s * 8min + some silence), so it's easy to handle.
[Suggestion] A recording "roadmap" is very useful: it sets the order and the target levels (see later) and recalls all the steps to do. It does not look like a tough job, but it is surprising how easily mistakes are made "in action"...
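The duration and size estimate can be reproduced as follows. This is a rough sketch: the 60s of extra "silence" (before and after the stop session) is an assumed figure, and `stop_recording_size` is just an illustrative name.

```python
def stop_recording_size(notes=54, sustain_s=4.0, gap_s=4.0, extra_silence_s=60.0,
                        rate_hz=44100, channels=2, bytes_per_sample=2):
    """Return (duration in seconds, size in bytes) of one per-stop recording."""
    # Each note is held for sustain_s, then followed by gap_s for the reverb tail.
    duration_s = notes * (sustain_s + gap_s) + extra_silence_s
    size_bytes = duration_s * rate_hz * channels * bytes_per_sample
    return duration_s, size_bytes

duration_s, size_bytes = stop_recording_size()
print(f"{duration_s / 60:.1f} min, {size_bytes / 1e6:.0f} MB")  # 8.2 min, 87 MB
```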
B2) Microphone specification
Target: select the microphones.
[Finding] We shall record in stereo. Mono gives poor results. We'll have to use a "matched" pair of mics. Never use 2 different mics.
[Finding] The place where the mics are set is the most important parameter. No rule can be given. Just guess and try. To get a "dry" recording, the mics shall not be far away from the pipes. Forget anything that reads “ambiance” or “surround”. Our goal is NOT to record the reverberation. (It will be a POOR recording according to the artistic standards...)
[Finding] The coverage of the mics shall be very consistent, and not change with the frequency. (Bass and treble organ pipes can be found anywhere in the span.) A 90° coverage is OK if the mics-to-case distance is equal to ½ of the width of the case. This will automatically lead to a “cardioid” pattern. Forget anything that reads “omni”, “hyper-cardioid” or “ambiance”. Do not use more than 2 mics (a pair) : we are not making a recording to please the ear, but to capture the pipes' sound.
[Finding] The transitory (“attack”) phase is *very* important for the organ pipes sound. The mics shall have a good transitory response.
[Finding] Maximum loudness is not very high. >110 dB SPL is enough.
[Finding] Frequency response : 20Hz-20kHz is the usual range. If the sampling is done at 44.1kHz, the Shannon theorem tells us that the highest sampled frequency will be 22.05 kHz. Better mic performance is useless. For sampling rates up to 96kHz, the question of the frequency response of the mics should be studied again.
[Finding] If the mics are placed between 5 and 10 m from the front pipes (½ of the case width), a sensitivity of 10mV/Pa is OK.
[Finding] An organ is noisy when not played! (Blower noise.) Noise performance for the mic is here not an issue. Something better than 23 dB(A) is OK : anyway, we will have to do digital noise reduction.
[Conclusion] Condenser (static) mics with a small (1/2") membrane are the best. Large membranes are no good here (their recording pattern is not consistent when the frequency changes, we do not need the sensitivity, and we can cope with a poor signal/noise ratio).
=> A pair of Oktava MK-012, or Rode NT5 are OK.
e.g. Rode NT5: Cardioid, 1/2" membrane, 20Hz-20kHz response, 38dB(A) signal/noise, 145dB SPL. (“A” means 'weighted to the performance of the human ear')
All those mics need phantom power (48V) (they are "static", i.e. condenser mics). Connectivity is XLR. (We are not looking for top-quality at >1kEuro, but, obviously, forget anything ending with a minijack...) Your preamp or interface shall handle that.
B3) Recording device
Target: select the equipment in which to connect the mics.
No real “finding” here, all that follows is only my guesses.
A USB computer interface with 2 XLR inputs and phantom power is OK.
Additional requirements may include: sampling up to 96kHz in 24 bits. USB 2.0. Check driver availability for 64 bit operating systems even if you do not use them, for future evolution. The “virtual organ world” definitely needs 64 bits operating systems.
-> e.g. TASCAM US-122MK2 (all previous requirements + no-latency monitoring + it's a wonderful midi/sound interface to operate a virtual organ... I have tried so many of them, including "firewire" bullshit (completely irrelevant here). This one is by far my favourite.)
-> Recording software: Audacity is free.
-> PC requirements: Intel Core or equivalent laptop, 2 GB RAM, at least 20 GB disk space. USB 2.0 interface. Use the mains (power) adapter and check that a power socket is available at the recording place: a laptop usually cannot last a long recording session on battery with the audio interface and the screen running all the time.
I do not know the “Mac” world.
If you are not VERY comfortable with computers (or unless you have VERY efficient and kind support), do not listen to anyone suggesting a “linux” type operating system. They are definitely my favourite (opensource), but I would not advise them here.
OR: use a digital “mini-recorder” (e.g Zoom H4 with “real” additional mics).
Minidiscs: no. We have reached a new century, now...
DATs sample at 48k : we'd have to re-sample for less than 4k of extra sampling rate, and lose way more than we gain.
B4) Recording level
The most difficult point so far (and the least documented, I think). We shall have enough "dynamic" to extract the signal from the blower noise, even for the quietest stops ("Nachthorn")... But obviously a single note shall not come out at 0dB, otherwise the virtual organ will not be able to “mix” it with another note (from another stop or another voice in the polyphony)... Our stops are not designed to be played alone (another difference with a “typical” recording).
[Think about it] Just for fun: imagine you play a 6-voice chord on an organ with 20 speaking ranks. This is roughly 128 pipes (2 to the power 7). It is not mathematically exact but, as a rule of thumb, we need 7 * 3 dB = 21dB of “room” below saturation (0dB), i.e. a single pipe should probably not be recorded louder than something like -21dB.
[Finding] At 0dB, in the analogue world, we were “hot” (saturation). At 0dB in the digital world, we are just dead (horrible cracks). Reaching saturation (0 dB) in the digital world is NOT an option. We have to be careful early on, or spend hours guessing and trying later...
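The "3 dB per doubling" rule of thumb corresponds to summing the power of equally loud, uncorrelated sources. A small sketch under that assumption (real pipes are neither equally loud nor fully uncorrelated; perfectly in-phase signals would add up at 6 dB per doubling instead):

```python
import math

def headroom_db(n_pipes):
    """Level rise when n_pipes equally loud, uncorrelated sources are summed
    (power sum, i.e. about 3 dB per doubling of the number of sources)."""
    return 10 * math.log10(n_pipes)

print(round(headroom_db(6 * 20), 1))  # 6 voices x 20 ranks = 120 pipes -> 20.8
print(round(headroom_db(128), 1))     # 2**7 pipes -> 21.1
```

Both figures land close to the 21dB quoted above.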
Back to the note-per-note organ recording:
[Finding] Anything below -40 dB is NOT enough.
[Finding] Anything louder than -20 dB will probably not allow a safe "mix" to reach 0 dB at the end of the day...
[Finding] -24 dB for a foundation stop (8') is OK.
[Finding] -21 dB for the loudest reeds (pedal Trumpet) is OK.
[Finding] If -24 dB is your target average, never go lower than -27dB in 8'.
[Finding] Allow -3dB per octave above 8' (i.e. the Siffloete 1' can only have -24-3*3= -33dB)
[Finding] Symmetrically, 16' stops shall be 3dB "louder". The "C" of a pedal Principal 16' can be safely as loud as -21dB.
[Finding] Baroque Mixtures shall be considered as a 8' Principal.
[Question] Can anyone confirm those figures? Has a systematic study been done? Can anyone explain those mere guesses/feelings?
[Partial conclusion] Aim at -24dB (flues) to -21dB (reeds) for 8' stops. -27 to -24 for 4'. -30 for 2' (we have no reeds at this level...). -33 for 1'. -24dB to -21dB for 16'. -24dB for Mixture stops.
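For the recording roadmap, the flue targets above follow a simple "3dB per octave away from 8'" rule that can be tabulated. A sketch only: `target_level_db` is an illustrative name, it covers the flue figures (reeds are quoted about 3dB louder above), and Mixtures are to be treated as an 8' Principal.

```python
import math

def target_level_db(footage, base_8ft_db=-24.0, step_db=3.0):
    """Target recording level for a flue stop of the given footage,
    3 dB lower per octave above 8', 3 dB higher per octave below."""
    octaves_above_8ft = math.log2(8 / footage)
    return base_8ft_db - step_db * octaves_above_8ft

for footage in (16, 8, 4, 2, 1):
    print(f"{footage}': {target_level_db(footage):.0f} dB")
# 16': -21 dB, 8': -24 dB, 4': -27 dB, 2': -30 dB, 1': -33 dB
```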
During the recording, the audio interface direct monitoring is OK.
Try with 4-5 notes (one in each octave). Use the Audacity vu-meter (if nothing better is available).
C) Samples processing 1 : generic statements
C1) quality standards
Especially when starting with "raw" recordings at 44.1kHz 16 bits, we shall avoid too many digital transformations. It's always better to do as many transformations as possible in a single pass.
[Finding] Always keep your original ("raw") recording. You'll go back to them sooner or later.
[Finding] What causes quality loss:
- re-sampling, i.e. changing the sampling frequency. Very high loss; try to avoid. If not possible, do it as late as possible in the processing chain. “Down-sampling” (96->44.1) is OK but shall be done ONCE, and NEVER more. Leave “up-sampling” to DVD movie fans and their 3 feet screens : an algorithm cannot find MORE information than provided in the input, unless “guessing” it. And this is definitely not what we want...
- noise reduction (very high loss, even if the signal/noise ratio obviously improves). But I would suggest beginning with that.
- tuning (pitch change) is NOT an obvious process : sampling frequency, duration and length are linked. And when we change pitch, we do not want to change the duration, especially not the length of the transient or the reverb phase. “Easy” pitch change effects (changing the duration) offered by audio tools are NOT what we are looking for. Tuning (pitch change) costs FFT + translation in the frequency domain + back_FFT. Tuning is NOT lossless. Keep that for the end of the process.
- loop points and release determination are crucial for quality. This can be done as often as you want, as it does not change the sample, but only the playback parameters.
- Reverb shall only be added at the very end of the process. Anything simpler than IR algorithms (convolution with an impulse response) is of no use. OR: keep it "dry" and use an external reverb on your virtual organ.
- Never, ever, try to "go back" in the process (e.g. change pitch twice, re-sample twice, etc...) Resume at the last step that was OK.
C2) software tools
- Recording: "Audacity" is free and OK. http://audacity.sourceforge.net/
- Cutting the samples (splitting the stop wav into as many note samples as the stop has notes). I found nothing satisfactory. Doing it "by hand" is tedious. I have written a script that does the job.
- Noise reduction: "SOX" (opensource). http://sox.sourceforge.net/
- Tuning: SOX is OK once you know by how many cents you want to move the note up or down. Additional help from scripts is needed to calculate the cents. Scripts may include a temperament table to test non-equal temperaments (OR: not to LOSE a fine non-equal temperament).
- Loop points: I found nothing satisfactory. Some loop software is OK, but most of it is neither free nor open source. "autoloop_0_1b" seems to be the only way out when avoiding to pay for software. http://www.appletonaudio.com/
Autoloop has been designed for pipe organs. For the moment, I rely on commercial software. Doing it by hand is not an option (unless you have weeks to spend).
- Organ definition file creation: it can easily be done by hand (text editor), but can be tedious. Again, nothing better than some scripting to generate the “.organ” files according to the file system. (We just try to test, not to create the best ever seen layout for “Grand_orgue”)
"Audacity", "SOX", "Autoloop", "Python" (www.python.org) for scripting.
When anything specific is needed (i.e. not covered by generic use software), the best is to write a Python script.
Python, with the "numpy" and "scipy" extensions (and matplotlib if you want to do fine analysis or printouts of spectral charts showing the partials of your waves), is opensource, free, and brings something as powerful as MATLAB, enough for our requirements.
[Question] Does anyone have a good understanding of the “dithering” process, used to avoid quantization noise (especially when using SOX)? I rely on the given examples, but am not very comfortable with it.
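As an illustration of the kind of helper script meant for tuning, here is a minimal sketch of the cents calculation. It assumes equal temperament with A4 = 440 Hz and that the pitch of the sample has already been measured somehow; the function names and the 438 Hz figure are made up for the example. A temperament table would replace `equal_tempered_hz`.

```python
import math

def equal_tempered_hz(midi_note, a4_hz=440.0):
    """Equal-temperament frequency of a MIDI note (69 = A4)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)

def cents_offset(measured_hz, target_hz):
    """Cents by which the sample must be shifted to reach target_hz
    (positive = shift up, 100 cents = one equal-tempered semitone)."""
    return 1200 * math.log2(target_hz / measured_hz)

# Hypothetical measurement: the A4 sample sounds at 438 Hz.
print(round(cents_offset(438.0, equal_tempered_hz(69)), 1))  # 7.9
```

The resulting figure is what a cents-based pitch shift (such as the SOX "pitch" effect) expects as input.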
C3) filesystem organization
[Finding] The "Grand_orgue" organization is simple and fine:
Some group the stops by keyboard (useless additional complexity).
[Finding] A nice naming rule for samples file is <midi_note_number>-<note_name>.wav (e.g. 036-C.wav for a lower "C")
Then, the <Organ_folder>/<Organ> folder holds the documentation. Documenting the recording (organ, hardware, software, conditions, methodology, processing) is VERY important.
I also add a <Organ_folder>/software folder to hold the scripts and their parameters
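The <midi_note_number>-<note_name>.wav naming rule is easy to generate from the MIDI note number. A small sketch, assuming sharps are written with "#" and that MIDI note 36 is the lower "C" as in the example above:

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def sample_filename(midi_note):
    """File name for one note sample, e.g. 36 -> '036-C.wav'."""
    return f"{midi_note:03d}-{NOTE_NAMES[midi_note % 12]}.wav"

print(sample_filename(36))                          # 036-C.wav
print([sample_filename(n) for n in range(36, 39)])  # ['036-C.wav', '037-C#.wav', '038-D.wav']
```

The zero-padded number keeps the files sorted in keyboard order in any file browser.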
D) Samples processing 2 : noise reduction
[Question] SOX is used below. Is there anything free that does it better?
a) With Audacity, select and save some seconds (5-10) of "background noise" (probably mainly the blower), as "steady" as possible (no clicks or cracks). Call the wav "silence.wav".
b) Generate the noise signature:
sox -c 2 silence.wav -n trim 0 2 noiseprof my.noise-profile
c) Apply on each recorded stop, e.g., for "Bourdon8":
..\sox-14-3-1\sox Bourdon8.wav Bourdon8nr.wav noisered my.noise-profile 0.3
You will get a noise-reduced Bourdon8nr.wav file.
Check the result, and possibly tune the parameter (0.3).
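With many stops, step c) can be driven from Python. A minimal sketch that only builds the same command line as above for each stop and, if asked, runs it. It assumes "sox" is on the PATH (otherwise pass its full path), and "Montre8" is just a hypothetical second stop name.

```python
import subprocess

def noisered_command(stop, profile="my.noise-profile", amount=0.3, sox="sox"):
    """Build the SOX command of step c) for one stop, e.g. 'Bourdon8'."""
    return [sox, f"{stop}.wav", f"{stop}nr.wav", "noisered", profile, str(amount)]

def denoise_stops(stops, run=False, **kwargs):
    """Return the commands for all stops; execute them only if run=True."""
    commands = [noisered_command(stop, **kwargs) for stop in stops]
    if run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if SOX reports an error
    return commands

for cmd in denoise_stops(["Bourdon8", "Montre8"]):
    print(" ".join(cmd))
# sox Bourdon8.wav Bourdon8nr.wav noisered my.noise-profile 0.3
# sox Montre8.wav Montre8nr.wav noisered my.noise-profile 0.3
```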
E) Samples processing 3 : extract notes from recordings
Not a difficult task if the signal/noise ratio is correct.
-> Algorithm, parameters and Python script will be provided.
We have a noise-reduced wav file holding all the notes of <stop>, called <stop>nr.wav, and the script creates the N (=number of notes in the stop) wav files in the <stop> folder.
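Until the actual script is posted, here is one possible approach, given only as a hypothetical sketch (it is NOT the announced script): compute the RMS level over short windows and keep the runs that stay above a threshold set just above the residual noise. It is written in plain Python on a list of samples to stay self-contained; numpy would be much faster on real files. In practice the threshold has to be low enough to keep the reverberation tail with each note.

```python
import math

def window_rms(samples, window):
    """RMS of consecutive, non-overlapping windows of `window` samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]

def find_notes(samples, window, threshold, min_windows=2):
    """Return (start, end) sample indices of the runs whose RMS stays above threshold."""
    notes, run_start = [], None
    levels = window_rms(samples, window) + [0.0]  # sentinel closes a trailing run
    for i, level in enumerate(levels):
        if level > threshold and run_start is None:
            run_start = i
        elif level <= threshold and run_start is not None:
            if i - run_start >= min_windows:      # ignore clicks shorter than this
                notes.append((run_start * window, i * window))
            run_start = None
    return notes

# Synthetic "recording": silence, a 440 Hz note, silence, a 660 Hz note, silence.
rate = 8000
def tone(freq, seconds):
    return [0.5 * math.sin(2 * math.pi * freq * n / rate) for n in range(int(rate * seconds))]
silence = [0.0] * rate
recording = silence + tone(440, 1.0) + silence + tone(660, 1.0) + silence
print(find_notes(recording, window=400, threshold=0.05))  # [(8000, 16000), (24000, 32000)]
```

Each (start, end) pair can then be written out as one wav file per note, in the order of the recording roadmap.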
<to be updated w/ script>
F) Samples processing 4 : setting loops and release points.
The funny part.
At this point, the sample set is not ready to be used on a virtual organ.
[Finding] Virtual organs are not like standard samplers. They have to cope with "very long sustain" notes (and do not have the RAM space to keep 1-2 minute long samples. Anyway, who could be sure that no organ note lasts more than 2 mins?). So, virtual organs use “loops”. (Not to be confused with "drum loops" ("acid" loops designed to place the beats) when googling the subject.)
The (simple) looping process sets two points (markers) in the sample (start, stop).
The virtual organ software will read the sample, and if more is needed after the "stop" point, it will jump back to "start".
So, the start and stop points shall be carefully chosen. The power (loudness) at each point shall be the same, and the wave shall smoothly match (or we'll hear a "click" each time the looping is done).
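To make the "smooth match" idea concrete, here is a hypothetical sketch of a seam check: it compares the few samples leading up to the "stop" point with those leading up to the "start" point, and picks the candidate "stop" point where they differ the least. This is only an illustration of the criterion, not how Autoloop or any commercial tool works, and it says nothing about how "alive" the loop will sound.

```python
import math

def seam_mismatch(samples, start, end, n=32):
    """RMS difference between the n samples before `end` and the n samples before `start`.
    Close to 0 means the wave arrives at `end` the same way it arrives at `start`."""
    a = samples[start - n:start]
    b = samples[end - n:end]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n)

def best_loop_end(samples, start, candidates, n=32):
    """Among candidate end points, return the one with the smallest seam mismatch."""
    return min(candidates, key=lambda end: seam_mismatch(samples, start, end, n))

# Synthetic test: a pure 100 Hz tone at 8000 Hz has a period of exactly 80 samples,
# so a clean loop must span a whole number of periods.
rate, freq = 8000, 100.0
wave = [math.sin(2 * math.pi * freq * i / rate) for i in range(rate)]
start = 400
print(best_loop_end(wave, start, range(start + 1000, start + 1080)))  # 1440
```

Here 1440 - 400 = 1040 samples, i.e. exactly 13 periods of the tone.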
When the keyboard note is released, the virtual organ sampler jumps to a "release point" in the wave, which corresponds to the actual release that was recorded.
Now the bad part : the human ear is VERY sensitive to those loops. When the loop is too short, the sound will be dull ("dead") like a poor "electronic" organ with short samples.
[Finding] Finding loop points is not a job. It's an art. Some virtual organs use multiple loops to enrich the behavior of the sustained notes.
[Question] Any solution for that (I was not able to do it with Autoloop) ?
[Note&finding] If you plan to use Grand_orgue only, multi-loops are not an issue. It does not support multi-loops, and the feature would probably be difficult to implement. Reading the source shows that “Grand_orgue” looks for the *longest* loop in the SMPL chunk of the wav file, and the *last* cuepoint in the CUE chunk of the wav file for the release point.
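To check what a wav file actually contains, the chunks can be read with the standard "struct" module. The sketch below follows the usual RIFF layout of the "smpl" and "cue " chunks and mimics the selection rule described above (longest loop, last cue point in list order); it is an illustration, not the Grand_orgue code. The demo builds a tiny chunk-only byte string instead of opening a real file.

```python
import struct

def riff_chunks(data):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE byte string."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length

def longest_loop(smpl):
    """(start, end) of the longest loop in a 'smpl' payload, or None if it has no loop."""
    (count,) = struct.unpack_from("<I", smpl, 28)         # number of loops
    # The loop records (24 bytes each) begin at offset 36; start/end sit at +8/+12.
    loops = [struct.unpack_from("<II", smpl, 36 + 24 * i + 8) for i in range(count)]
    return max(loops, key=lambda loop: loop[1] - loop[0], default=None)

def last_cue(cue):
    """Sample offset of the last cue point in a 'cue ' payload, or None if empty."""
    (count,) = struct.unpack_from("<I", cue, 0)
    if count == 0:
        return None
    return struct.unpack_from("<I", cue, 4 + 24 * (count - 1) + 20)[0]

# Demo on a made-up file holding only a 'smpl' chunk (2 loops) and a 'cue ' chunk.
smpl = struct.pack("<9I", 0, 0, 22675, 60, 0, 0, 0, 2, 0)
smpl += struct.pack("<6I", 0, 0, 1000, 5000, 0, 0)
smpl += struct.pack("<6I", 1, 0, 2000, 90000, 0, 0)
cue = struct.pack("<I", 1) + struct.pack("<II4sIII", 0, 0, b"data", 0, 0, 120000)
body = (b"WAVE" + b"smpl" + struct.pack("<I", len(smpl)) + smpl
        + b"cue " + struct.pack("<I", len(cue)) + cue)
wav = b"RIFF" + struct.pack("<I", len(body)) + body
chunks = dict(riff_chunks(wav))
print(longest_loop(chunks[b"smpl"]), last_cue(chunks[b"cue "]))  # (2000, 90000) 120000
```

For a real sample, read the bytes with open(path, "rb").read() and pass them to riff_chunks.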
[Finding] Release point determination is, on the other hand, very easy. A simple monitoring of the RMS power of the sample will do the job. This is very easy to script.
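A minimal sketch of such an RMS monitoring, in plain Python: the release point is taken as the end of the last window whose RMS is still within a few dB of the loudest window. The 3dB drop and the window size are assumptions to tune by ear, and real (noisy) samples may need some smoothing.

```python
import math

def release_point(samples, window, drop_db=3.0):
    """Sample index just after the last window whose RMS is within drop_db of the maximum."""
    levels = [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]
    limit = max(levels) * 10 ** (-drop_db / 20)
    last_loud = max(i for i, level in enumerate(levels) if level >= limit)
    return (last_loud + 1) * window

# Synthetic note: 1s of steady tone, then 1s of exponentially decaying "reverb".
rate, freq, window = 8000, 200.0, 400
sustain = [0.5 * math.sin(2 * math.pi * freq * n / rate) for n in range(rate)]
tail = [0.5 * math.exp(-40 * n / rate) * math.sin(2 * math.pi * freq * n / rate)
        for n in range(rate)]
print(release_point(sustain + tail, window))  # 8000
```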
<To be continued> (A script will be provided that writes the loop and release points in the “chunks” (i.e. not the sound data, only the playback parameters) of the wav files.
G) Samples processing 5 : (re-)tuning, reverb
Can be done with SOX. Not mandatory.
<To be continued> (A lot of fun, here.)
H) Creation of the package
Again, a script makes the job easier. But taking an existing (simple) .organ file and editing it by hand costs only a couple of minutes. Try to create a working generic (plain) .organ file (without any coupler, tremulant, division specification, divisionals, bells or whistles...). The .organ file format is straightforward once all those gizmos (which can be added later) are removed.
<To be continued>