Example of feedback within an OMR system

Figure 1 shows a small part of a sheet of music. The optical music recognition system begins by examining the page and deciding which objects on the page are potential staff systems. The staff processing module then identifies the staff lines and removes any objects that are superimposed on the staff system (Figure 2).

Figure 1 - Original bitmap

Figure 2 - Staff lines detected and removed

Notice in Figure 2 how objects that only just brush a staff line become broken. All objects removed from the staff systems are processed by the primitive identification module. Figure 3 shows the objects that it could not recognise. These include segments of a flat and of the bass clef, both of which have been distorted by the staff line removal process.

Figure 3 - Unrecognised objects

Because of the modular design of the prototype system, it was relatively easy to create a new module that tries to correct this type of deformation. A Primitive Segmentation specialist was written that joins unknown objects in close proximity to create a new object. This module then requests that the new objects be recognised, and the controlling process decides to call the primitive identification specialist again.
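
As a rough illustration of the joining step, the sketch below merges fragments whose bounding boxes lie within a small gap of each other. It is not the prototype's code: the `Box` type, the `join_close_fragments` function and the `max_gap` threshold are all hypothetical, and working on bounding boxes rather than pixels is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box of one unrecognised fragment (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int


def gap(a, b):
    """Larger of the horizontal and vertical gaps between two boxes (0 if they overlap)."""
    dx = max(a.left - b.right, b.left - a.right, 0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0)
    return max(dx, dy)


def union(a, b):
    """Smallest box that contains both a and b."""
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


def join_close_fragments(boxes, max_gap=2):
    """Repeatedly merge any two boxes that are at most max_gap pixels apart."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if gap(merged[i], merged[j]) <= max_gap:
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


# Two halves of a symbol split by a removed staff line, plus a distant object.
fragments = [Box(10, 10, 14, 18), Box(10, 20, 14, 30), Box(100, 10, 110, 20)]
joined = join_close_fragments(fragments)
```

In this example the first two fragments are two pixels apart vertically and are joined into one candidate object, while the distant third box is left alone. The joined objects would then be the ones resubmitted for identification.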

Figures 4 and 5 show the broken flat and broken bass clef now correctly identified by the second call to the identification module. One of the flats, however, has been incorrectly recognised as a natural. Ideally, the later musical semantics module could guess that it should be a flat from its position in a potential key signature, and relay this information back to the primitive identification module. That module would (also ideally) re-evaluate its classification of the primitive with the knowledge that it is possibly (or probably) a flat instead of a natural.
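
The text describes this re-evaluation only as a goal, so the following is purely a sketch of one way such a hint could be used. It assumes the identification module keeps a match score per candidate symbol and that a hint simply boosts the score of the suggested symbol. The `reclassify` function, the `boost` factor and the scores themselves are invented for illustration.

```python
def reclassify(scores, hint=None, boost=2.0):
    """Pick the best-scoring label, favouring the hinted label if one is given.

    scores: mapping from symbol name to match score (hypothetical values).
    hint:   symbol name suggested by a later module, or None.
    """
    adjusted = dict(scores)
    if hint in adjusted:
        adjusted[hint] *= boost
    return max(adjusted, key=adjusted.get)


# Invented scores for a damaged accidental that looks slightly more like a natural.
scores = {"natural": 0.48, "flat": 0.41, "sharp": 0.11}

first_pass = reclassify(scores)                # no hint available yet
second_pass = reclassify(scores, hint="flat")  # semantics suggests a flat
```

Without a hint the symbol is classified as a natural. With the hint from the key signature context, the boosted score for the flat wins, which is the outcome the paragraph above hopes for.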

Figure 4 - Bass clefs after second identification call

Figure 5 - Flats after second identification call

Figure 6 shows objects that matched the primitive patterns for "4"s and "5"s. The primitive assembly module has a rule that creates a new object for time signatures if a number is found above and in close proximity to another number. Figure 7 shows the resulting time signature objects.
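
The assembly rule could be sketched roughly as follows. The `Digit` type, the `assemble_time_signatures` function and both thresholds are hypothetical, and the sketch assumes image coordinates in which y grows downwards, so the upper number has the smaller y values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Digit:
    """A recognised number primitive (hypothetical type)."""
    value: int
    x: int       # horizontal centre, in pixels
    top: int     # y of the top edge (y grows downwards)
    bottom: int  # y of the bottom edge


def assemble_time_signatures(digits, max_dx=3, max_gap=4):
    """Pair each number with a number directly below it and close to it."""
    signatures = []
    for upper in digits:
        for lower in digits:
            if upper is lower:
                continue
            aligned = abs(upper.x - lower.x) <= max_dx
            vertical_gap = lower.top - upper.bottom
            if aligned and 0 <= vertical_gap <= max_gap:
                signatures.append((upper.value, lower.value))
    return signatures


# A "5" stacked on a "4", plus an unrelated "4" elsewhere on the staff.
digits = [Digit(5, 40, 10, 20), Digit(4, 41, 22, 32), Digit(4, 200, 10, 20)]
time_signatures = assemble_time_signatures(digits)
```

Here only the stacked pair forms a time signature, giving 5 over 4. The isolated "4" is left as a plain number object.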

Figure 6 - '4' and '5' objects

Figure 7 - Assembled Time Signatures (with staff lines)

After this, the semantics stage creates a 2-dimensional graph over the found objects. There are 3 types of nodes - "primary" nodes for musical events with duration (such as notes and rests), "secondary" nodes for objects that directly affect primary nodes, and special marker nodes that are used for marking the start and end of systems, staffs within systems, and bars. These nodes are created based on separate rules for different notations --- for the given image, the "Common Music Notation" semantics functions were used. Eventually this module should be able to generate requests to correct many automatically detected errors, such as incorrect bar durations, or mis-matched key signatures and/or time signatures between staffs in the same system.
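
To make the three node types concrete, here is a minimal sketch that also shows the kind of bar duration check mentioned above. It flattens the graph into a simple sequence, which is a deliberate simplification of the 2-dimensional structure. The `Kind` and `Node` types, the marker labels and the `bars_with_wrong_duration` function are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from fractions import Fraction


class Kind(Enum):
    PRIMARY = "primary"      # events with duration: notes and rests
    SECONDARY = "secondary"  # objects that modify primary nodes
    MARKER = "marker"        # start/end of systems, staffs and bars


@dataclass
class Node:
    kind: Kind
    label: str
    duration: Fraction = Fraction(0)  # only meaningful for primary nodes


def bars_with_wrong_duration(nodes, expected):
    """Return the indices of bars whose primary durations do not sum to expected."""
    wrong = []
    bar_index = 0
    total = Fraction(0)
    in_bar = False
    for node in nodes:
        if node.kind is Kind.MARKER and node.label == "bar_start":
            total = Fraction(0)
            in_bar = True
        elif node.kind is Kind.MARKER and node.label == "bar_end":
            if total != expected:
                wrong.append(bar_index)
            bar_index += 1
            in_bar = False
        elif node.kind is Kind.PRIMARY and in_bar:
            total += node.duration


    return wrong


quarter = Fraction(1, 4)
nodes = (
    [Node(Kind.MARKER, "bar_start")]
    + [Node(Kind.PRIMARY, "note", quarter) for _ in range(4)]
    + [Node(Kind.MARKER, "bar_end"), Node(Kind.MARKER, "bar_start"),
       Node(Kind.SECONDARY, "flat")]
    + [Node(Kind.PRIMARY, "note", quarter) for _ in range(3)]
    + [Node(Kind.MARKER, "bar_end")]
)
suspect_bars = bars_with_wrong_duration(nodes, expected=Fraction(4, 4))
```

With a 4/4 time signature, the second bar (index 1) contains only three quarter notes and is reported. A result like this is what could trigger a request back to the earlier modules.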

Finally, when no requests are generated by the semantics module (or none are accepted by the coordinating module), files in various formats are output. For example, some files show the results of the primitive identification, some are written in a musical format (GUIDO) based on the lattice, and others (generally for debugging) give graphical representations of the lattice.

Other examples of feedback

The prototype system also uses feedback in a few other places. For example, if no staff systems are found, a request is sent back to the page layout module to "look harder" for some systems. The page layout module will then perform a Hough transform and rotation in case the page was not scanned straight enough. Similarly, if systems are found but the staff processing module thinks that they are skewed, a request is sent to the page layout module to correct the skew.

Note that the staff processing module doesn't talk directly to the page layout module; it doesn't even know about the existence of other modules. All it does is send a request, such as "remove skew", to the coordinator. The only requirement is that the modules that can generate and/or fulfil requests agree on the names used.
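
This decoupling can be sketched as a coordinator that maps request names to handlers. The `Coordinator` class and its methods are hypothetical, and the acceptance rule here (accept any request that has a registered handler) is an assumption. The prototype's coordinator may decide differently.

```python
class Coordinator:
    """Routes named requests to whichever module registered for that name."""

    def __init__(self):
        self._handlers = {}

    def register(self, request_name, handler):
        """A module announces that it can fulfil requests with this name."""
        self._handlers[request_name] = handler

    def request(self, request_name, **details):
        """Forward a request; return False if no module can fulfil it."""
        handler = self._handlers.get(request_name)
        if handler is None:
            return False
        handler(**details)
        return True


log = []
coordinator = Coordinator()

# The page layout module registers for "remove skew" (handler is a stand-in).
coordinator.register(
    "remove skew",
    lambda angle: log.append(f"page layout: rotating by {-angle} degrees"),
)

# The staff processing module only knows the coordinator and the request name.
accepted = coordinator.request("remove skew", angle=1.5)
ignored = coordinator.request("find more systems")
```

The requesting module never references the page layout module. The only shared knowledge is the string "remove skew", which mirrors the naming agreement described above.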

Last modified 20 Mar 2002