Persistent near real‐time passive acoustic monitoring for baleen whales from a moored buoy: System description and evaluation

Managing interactions between human activities and marine mammals often relies on an understanding of the real‐time distribution or occurrence of animals. Visual surveys typically cannot provide persistent monitoring because of expense and weather limitations, and while passive acoustic recorders can monitor continuously, the data they collect are often not accessible until the recorder is recovered. We have developed a moored passive acoustic monitoring system that provides near real‐time occurrence estimates for humpback, sei, fin and North Atlantic right whales from a single site for a year, and makes those occurrence estimates available via a publicly accessible website, email and text messages, a smartphone/tablet app and the U.S. Coast Guard's maritime domain awareness software. We evaluated this system using a buoy deployed off the coast of Massachusetts during 2015–2016 and redeployed during 2016–2017. Near real‐time estimates of whale occurrence were compared to simultaneously collected archived audio as well as whale sightings collected near the buoy by aerial surveys. False detection rates for right, humpback and sei whales were 0% and nearly 0% for fin whales, whereas missed detection rates at daily time scales were modest (12%–42%). Missed detections were significantly associated with low calling rates for all species. We observed strong associations between right whale visual sightings and near real‐time acoustic detections over a monitoring range of 30–40 km and temporal scales of 24–48 hr, suggesting that silent animals were not especially problematic for estimating occurrence of right whales in the study area. There was no association between acoustic detections and visual sightings of humpback whales. The moored buoy has been used to reduce the risk of ship strikes for right whales in a U.S. Coast Guard gunnery range, and can be applied to other mitigation applications.


| INTRODUCTION
Marine mammals are an integral part of the ocean ecosystem and many are impacted by human activities, but like most marine organisms, their occurrence, distribution and abundance are a challenge to monitor from unmanned ocean observing systems. Human observers have traditionally detected marine mammals during visual surveys, relying on the animals to return to the sea surface periodically to breathe where they can be visually detected. This approach is often expensive, as it requires a large team of observers and a ship or aircraft.
Moreover, visual surveys are limited by weather and sighting conditions, such as fog, rain, heavy seas and darkness. Given their expense, visual surveys are often inefficient for persistent real-time monitoring of marine mammal occurrence, although for other tasks, such as photo identification, health assessment and abundance estimation, they remain an essential observing methodology for many species.
In recent decades, passive acoustic recorders have become extremely popular for detecting vocally active marine mammals, as they can operate continuously for periods of months to years (Mellinger, Stafford, Moore, Dziak, & Matsumoto, 2007; Van Parijs et al., 2009). Widespread use of passive acoustics for persistent marine mammal monitoring faces two challenges: (a) most passive acoustic recordings are only available for analysis after instruments are recovered and (b) analysis of passive acoustic recordings is typically slow and tedious, involving trained human analysts that pore over large volumes of acoustic data or verify automated detections to assess occurrence. In many cases (particularly research applications), the delays in access and analysis are perfectly acceptable, but for mitigation applications or those involving real-time response, most passive acoustic recorders are unhelpful.
There is an urgent need for real-time information on the occurrence of marine mammals for both science and mitigation applications. Such a real-time capability can improve the efficiency of traditional visual-based research efforts by identifying areas where animals are likely to be located, and can provide critical occurrence information in sensitive areas where human activities must be managed to avoid harmful interactions with marine mammals. Van Parijs et al. (2009) reviewed several real-time or near real-time passive acoustic systems, including the Cornell University North Atlantic right whale detection buoy, which has been used to reduce ship strike risks from liquefied natural gas tankers transiting the shipping lanes approaching Boston, Massachusetts for over a decade. The work described here took inspiration from Cornell's innovative and pioneering efforts. We developed a system to monitor the occurrence of baleen whales in near real-time from long-endurance Slocum ocean gliders (Baumgartner et al., 2013), and have in recent years adapted this system to operate from a purpose-built moored buoy. With the development of an analyst protocol, we have also formalized the review of detection data in near real-time to substantially improve the accuracy of the system. This paper describes the moored buoy system and analyst protocol, and evaluates the accuracy of near real-time whale occurrence estimates derived from a buoy located near the Massachusetts coast. This evaluation compares occurrence estimates derived in near real-time to those derived from (a) a review of simultaneously collected archived audio and (b) visual sightings collected by aerial surveys for humpback Megaptera novaeangliae, sei Balaenoptera borealis, fin Balaenoptera physalus and North Atlantic right whales Eubalaena glacialis.

| System overview
A moored buoy was designed to deliver detection data in near real-time from a passive acoustic instrument on the sea floor to a shoreside computer where an analyst could review the data to determine baleen whale occurrence. The passive acoustic instrument used in this system was the digital acoustic monitoring (DMON) instrument that is capable of running the low-frequency detection and classification system (LFDCS) firmware developed to identify baleen whale calls. The mooring hardware allowed the delivery of power and data between the sea floor and a surface buoy via stretch hoses that isolated the motion of the surface buoy from an aluminium frame on the sea floor to which the DMON was attached. The surface buoy contained a platform computer to store DMON/LFDCS data and to transmit these data to shore every 2 hr via the Iridium satellite system. Upon reception, the DMON/LFDCS detection data were immediately displayed on a publicly accessible website and were reviewed once a day by an analyst. The results of the analyst review were posted on the website and disseminated automatically to researchers, managers, the United States Coast Guard and other stakeholders via email and text messages.

| Digital acoustic monitoring instrument
The DMON instrument is an acoustic hardware device that can (a) sample from up to three integrated hydrophones, (b) process and record the resulting audio with a programmable Texas Instruments TMS320C55 digital signal processor (DSP) and 32 GB of flash memory and (c) communicate detection (and other) information to an external computer using serial input/output lines (Johnson & Hurst, 2007; Baumgartner et al., 2013; see supporting information). The instrument is extremely low power, making it ideal for use on power-limited autonomous platforms. For the application described here, the electronics board, integrated lithium battery and hydrophones were packaged in an oil-filled, acoustically transparent urethane housing.

KEYWORDS
acoustics, autonomous, buoy, conservation, mitigation, real-time, ship strikes, whale

| Low-frequency detection and classification system
The LFDCS was originally developed to detect and classify the tonal sounds of baleen whales in archived audio (Baumgartner & Mussoline, 2011), but was later ported to run on the DMON instrument in real-time (Baumgartner et al., 2013). Detailed descriptions of the LFDCS can be found in Baumgartner and Mussoline (2011) and Baumgartner et al. (2013), but briefly, the system builds a spectrogram using the short-time Fourier transform, creates pitch tracks of tonal calls in the spectrogram (a pitch track is a time series of frequency-amplitude pairs that describes a sound in a manner analogous to a series of notes on a page of sheet music), and classifies each call by comparing attributes of the pitch track to those of known call types in a call library using quadratic discriminant function analysis. For the application described here, audio was sampled at 2000 Hz, compressed using a lossless algorithm (Johnson, Partan, & Hurst, 2013) and archived to flash memory on a 50% duty cycle (30 min every hour).
These recordings were accessible upon recovery of the mooring and used to evaluate the accuracy of near real-time detections.
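The idea of a pitch track can be illustrated with a deliberately simplified sketch. The code below is not the LFDCS algorithm: it assumes a Hann-windowed short-time Fourier transform and simply keeps the loudest frequency bin in each frame, whereas the actual pitch-tracking and quadratic discriminant function analysis are described in the papers cited above. The function names, window length, hop size and threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def spectrogram(audio, fs, nfft=512, hop=128):
    """Magnitude spectrogram in dB from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(nfft)
    n_frames = 1 + (len(audio) - nfft) // hop
    frames = np.stack(
        [audio[i * hop : i * hop + nfft] * window for i in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + nfft / 2) / fs  # frame centres (s)
    return times, freqs, 20 * np.log10(mag + 1e-12)

def naive_pitch_track(times, freqs, spec_db, threshold_db):
    """Return rows of (time, frequency, amplitude) for frames whose peak exceeds a threshold.

    Illustrative stand-in only: the real LFDCS pitch tracker is more sophisticated.
    """
    peak_bins = spec_db.argmax(axis=1)
    peak_amp = spec_db[np.arange(len(times)), peak_bins]
    keep = peak_amp > threshold_db
    return np.column_stack([times[keep], freqs[peak_bins[keep]], peak_amp[keep]])

# Synthetic 1-s upsweep from 100 to 200 Hz, sampled at 2000 Hz as in the study.
fs = 2000
t = np.arange(0, 1.0, 1 / fs)
sweep = np.sin(2 * np.pi * (100 * t + 50 * t**2))  # instantaneous frequency 100 + 100 t
times, freqs, spec_db = spectrogram(sweep, fs)
track = naive_pitch_track(times, freqs, spec_db, threshold_db=0.0)
```

Each row of `track` is one frequency-amplitude pair with its time stamp, which is the sense in which a pitch track resembles notes on a page of sheet music.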
During operation aboard an autonomous platform, the DMON/LFDCS regularly relays summary detection data, detailed detection data, status information (e.g. system voltage, available memory) and background noise estimates to the platform computer. Summary detection data consist of tallies of classified calls for every call type in the call library, which are relayed to the platform computer every 15 min (review and evaluation of detection data, described below, are organized in these 15-min tally periods). Detailed detection data are sent to the platform computer in real-time and include pitch tracks and associated classification information, but only up to a maximum of 8 kilobytes of detection data per hour. This data transmission limitation is designed solely to reduce operating costs by limiting (a) the amount of data sent through the Iridium satellite service and (b) time spent by the analyst reviewing pitch track data; however, the data transmission rate is configurable and can be eliminated altogether if the associated transmission and analysis costs can be accommodated.
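The effect of the hourly cap can be sketched as follows. This is a hypothetical illustration, not the DMON/LFDCS firmware: the class name, the per-clock-hour reset, the choice of 1 kilobyte = 1024 bytes and the rule that transmission stops for the rest of the hour once a record no longer fits are all assumptions.

```python
class HourlyByteBudget:
    """Caps the detailed detection data forwarded in each clock hour (hypothetical helper)."""

    def __init__(self, limit_bytes=8 * 1024):  # assumes 1 kilobyte = 1024 bytes
        self.limit_bytes = limit_bytes
        self.current_hour = None
        self.used = 0
        self.exhausted = False

    def offer(self, timestamp_s, payload):
        """Return True if the record is forwarded, False if it is withheld."""
        hour = int(timestamp_s // 3600)
        if hour != self.current_hour:
            # A new hour starts with a fresh budget.
            self.current_hour, self.used, self.exhausted = hour, 0, False
        if self.exhausted or self.used + len(payload) > self.limit_bytes:
            # Assumed behaviour: once the cap is hit, nothing more is sent this hour.
            self.exhausted = True
            return False
        self.used += len(payload)
        return True

# Toy example with a 100-byte cap so the behaviour is easy to follow.
budget = HourlyByteBudget(limit_bytes=100)
results = [
    budget.offer(0, b"x" * 60),     # fits
    budget.offer(600, b"x" * 60),   # would exceed the cap
    budget.offer(700, b"x" * 10),   # withheld: budget already exhausted
    budget.offer(3600, b"x" * 60),  # next hour, budget reset
]
```

Under these assumptions, calls produced late in a busy hour never reach the analyst, even though they remain in the archived audio.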

| Quiet mooring
We utilized a mature mooring design that allowed both quiet operation as well as delivery of digital data from the sea floor to shore (Figure 1).
The DMON was housed in open cell foam and a urethane fairing and affixed to a bottom-mounted aluminium frame called the multi-function node (MFN), which in turn was attached to the surface buoy by stretch hoses. These hoses can stretch to nearly twice their relaxed length (Paul & Bocconcelli, 1994), thereby absorbing the motion of the buoy in rough wave conditions and keeping the MFN acoustically quiet. The hoses also contain helically wound conductors that allow power and data to be delivered between the buoy and the DMON. The surface buoy contains a platform computer, Iridium and global positioning satellite (GPS) antennas and a 450-Ahr battery pack to power all system components.
The platform computer receives and stores DMON/LFDCS data sent in real-time via the stretch hoses, and once every 2 hr, transmits these stored data to shore via an Iridium satellite modem (Figure 2). The buoy was designed to operate at sea for at least one year.

| Near real-time analysis
All data are received by a dedicated shore-side server, immediately processed and displayed on a publicly accessible website (dcs.whoi.edu) for review by an experienced analyst (Figure 2). For each 15-min tally period for which detailed detection data were transmitted, pitch track data and associated classification information are displayed on a single webpage in stacked 1-min panels (e.g. Figure 3a). The analyst reviews these data and fills out a form on the webpage for each monitored 15-min period to indicate whether each of the monitored species was "detected", "possibly detected" or "not detected" during the tally period; the form allows the entry of notes as well.
The analyst uses a standardized and documented protocol (available at dcs.whoi.edu/#protocol) developed jointly by the National Oceanic and Atmospheric Administration's Northeast Fisheries Science Center (NOAA NEFSC) Passive Acoustics Group and the Woods Hole Oceanographic Institution to determine how a tally period should be scored. In general, a tally period is scored as "detected" when there is convincing evidence of a species' acoustic presence, "possibly detected" when there is some evidence of acoustic presence, but the evidence is not completely convincing, or "not detected" when there is no reasonable evidence of a species' acoustic presence (see supporting information for further explanation). We chose to emphasize minimizing false detections when developing the protocol, so the analyst is encouraged to be conservative (i.e. cautious) by only scoring tally periods as "detected" when there is strong evidence of acoustic presence.
The analyst reviews individual pitch tracks, associated classification information, and the context in which individual pitch tracks occur to assess species occurrence. Three of the four species monitored for this study make calls in distinct patterns that can be easily discerned in the pitch track displays. These include humpback whale song (Figure 3), sei whale low-frequency doublets or triplets and fin whale 20-Hz pulse sequences. Assessing context (i.e. pitch tracks in temporal proximity to a pitch track of interest) is particularly helpful when identifying right whale upcalls, which can be confused with a similar upsweep sometimes present in humpback whale song (authors' personal observations).
In practice, the analyst reviewed detection data for this study once a day, usually between 07.00 and 10.00 local time, and the resulting near real-time occurrence estimates were displayed on the website within minutes of the analyst's review. The near real-time occurrence estimates were also (a) distributed directly to interested users via email and text messages, eliminating the need for users to check the website constantly, (b) made available in Whale Alert (www.whalealert.org), a smartphone/tablet app for iOS and Android platforms and (c) viewable in the U.S. Coast Guard's One View software to easily allow Coast Guard personnel to monitor whale presence.

FIGURE 1 Design of the DMON/LFDCS mooring, including surface buoy, stretch hoses and multi-function node (MFN) to which the DMON was affixed. The location of the DMON/LFDCS buoy southwest of Martha's Vineyard, Massachusetts is also shown.

| Evaluation of real-time occurrence estimates with archived audio
[...] four of the monitored species were present and to make the manual audio analysis manageable. Spectrograms and audio were reviewed visually and aurally, respectively, to determine species occurrence during the entirety of each 15-min tally period (regardless of the duration that the same tally period was actually monitored in near real-time). As in the near real-time analysis, each 15-min tally period in this audio analysis was scored as "detected", "possibly detected" or "not detected" based on how convincing the acoustic evidence was. We assessed the accuracy of the near real-time analysis by treating the retrospective audio analysis as the truth and comparing the results of the two analyses using confusion matrices. Only periods scored as either "detected" or "not detected" in both the near real-time and audio analyses were assessed (periods scored as "possibly detected" in either the near real-time or audio analyses were assessed separately). A variety of performance metrics were used to quantify the accuracy of the near real-time analysis (Figure 4).
Cases in which there was disagreement between the near real-time and retrospective audio analyses were examined to determine the reason for the disagreement. Finally, logistic regression was used to determine if the probability of missed occurrences in near real-time was related to the amount of daily calling activity.
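The confusion-matrix comparison can be sketched in a few lines. The metric definitions below are assumptions, since the paper's own definitions are given in Figure 4: the missed detection rate is taken as FN / (TP + FN), and the false detection rate as FP / (TP + FP), which matches the Discussion's reading that a near-zero false detection rate means a "detected" score is almost always correct. The function names and the toy scores are hypothetical.

```python
from collections import Counter

def confusion_counts(near_real_time, audio_truth):
    """Tally TP/FN/FP/TN over paired tally-period scores, treating the audio analysis as truth.

    Periods scored "possibly detected" in either analysis are skipped,
    mirroring the separate treatment described in the text.
    """
    counts = Counter()
    for nrt, truth in zip(near_real_time, audio_truth):
        if "possibly detected" in (nrt, truth):
            continue
        if truth == "detected":
            counts["TP" if nrt == "detected" else "FN"] += 1
        else:
            counts["FP" if nrt == "detected" else "TN"] += 1
    return counts

def detection_rates(counts):
    """Return (missed detection rate, false detection rate) under the assumed definitions."""
    tp, fn, fp = counts["TP"], counts["FN"], counts["FP"]
    missed = fn / (tp + fn) if tp + fn else float("nan")
    false_det = fp / (tp + fp) if tp + fp else float("nan")
    return missed, false_det

# Toy scores for six tally periods (not data from the study).
nrt = ["detected", "not detected", "not detected",
       "possibly detected", "detected", "not detected"]
audio = ["detected", "detected", "not detected",
         "detected", "detected", "not detected"]
counts = confusion_counts(nrt, audio)
missed, false_det = detection_rates(counts)
```

In this toy case one of three truly occupied periods is missed and no period is falsely scored as "detected".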

| Evaluation of real-time occurrence estimates with visual sightings
The accuracy of near real-time whale occurrence estimates was also evaluated with whale sightings collected by aerial surveys conducted near the DMON buoy. Comparison of occurrence estimates derived from passive acoustics and visual observations is challenging because of the significant differences in the detectability of whales between the two methods. Neither passive acoustics nor visual surveys are perfect detection systems; nevertheless, when one system correctly detects a whale, there is a reasonable expectation that the other system should detect it as well.
We compared occurrence estimates derived from aerial surveys flown near the Nomans Land buoy site to the near real-time passive acoustic occurrence estimates derived from the buoy using log odds ratio tests. Aerial surveys were conducted by the New England Aquarium (NEAq) and the NOAA NEFSC using standard large whale survey protocols (two observers on either side of the plane, 229-305 m altitude, 185 km/hr speed).
Visual occurrence was evaluated on a daily basis within particular radii of the buoy for the aerial survey observations (within 20-60 km in 10 km increments), and acoustic occurrence was evaluated within particular time intervals before the start of the aerial survey for the near real-time passive acoustic observations from the buoy (within 12-72 hr in 12 hr increments; note that only the period before a survey was examined so that acoustic occurrence prior to the survey could be used prospectively to predict visual occurrence during the survey; see end of this paragraph).
FIGURE 2 Diagram of data flow from the DMON mounted on the multifunction node (MFN) to a shore-side server via the stretch hoses, surface buoy and Iridium satellite service. These data are displayed on a website and reviewed by an analyst to produce species-specific occurrence estimates for each monitored tally period. Occurrence estimates are then distributed to users via a publicly accessible website as well as email and text messages.

The log odds ratio test evaluates the ratio of the odds of acoustic detection when a species is visually present to the odds of acoustic detection when a species is visually absent. The log odds ratio was evaluated using a logistic regression between the near real-time passive acoustic observations (dependent variable) and the visual observations (independent variable). To account for multiple comparisons over several radii and time intervals, we used a Bonferroni adjusted alpha threshold of 0.00167 (α_Bonferroni = α ÷ 5 radii ÷ 6 time intervals, where α = 0.05) to determine the significance of log odds ratios. In addition to comparing daily occurrence estimates, we also used logistic regression to assess whether the probability of detecting a species during an aerial survey was related to the percentage of near real-time tally periods scored as "detected" within 12-72 hr prior to the start of the survey.
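The arithmetic behind the Bonferroni threshold and the log odds ratio can be sketched as follows. The function name is hypothetical, and the point estimate is computed directly from a 2 × 2 table rather than by fitting the logistic regression used in the study (for a single binary predictor the two give the same log odds ratio). The day counts are those reported for sei whales in the Results (6 of 9 days with sightings, 3 of 27 days without) and serve only as a worked example; the significance test itself is not reproduced.

```python
import math

# Grid of comparisons described in the text.
radii_km = list(range(20, 61, 10))      # 20, 30, 40, 50, 60 km
intervals_hr = list(range(12, 73, 12))  # 12, 24, 36, 48, 60, 72 hr

alpha = 0.05
alpha_bonferroni = alpha / (len(radii_km) * len(intervals_hr))  # 0.05 / 30

def log_odds_ratio(det_present, n_present, det_absent, n_absent):
    """Log of (odds of acoustic detection | sighted) / (odds of acoustic detection | not sighted).

    Undefined when any cell of the 2 x 2 table is zero; no correction is applied here.
    """
    odds_present = det_present / (n_present - det_present)
    odds_absent = det_absent / (n_absent - det_absent)
    return math.log(odds_present / odds_absent)

# Worked example: odds are 6/3 = 2 when sighted and 3/24 = 0.125 when not sighted.
lor = log_odds_ratio(6, 9, 3, 27)
```

Here the odds ratio is 2 ÷ 0.125 = 16, so `lor` is the natural logarithm of 16.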

| Statistical treatment
FIGURE 3 (a) Display of detection information transmitted in near real-time as it appears on the website, which includes pitch tracks (coloured lines; quiet sounds in cool colours, loud sounds in warm colours) and associated classification information for classified calls (numbers below some pitch tracks).

Whenever percentages were used in correlation or regression analyses, they were transformed using the arcsine square-root transform: $p' = \arcsin\left(\sqrt{p}\right)$, where $p$ is the percentage expressed as a proportion ([...] 1995). Axes of transformed values [...]
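As a minimal sketch, the arcsine square-root transform of a percentage and its inverse can be written as below. The function names are hypothetical, and the input is assumed to be a percentage between 0 and 100 that is first converted to a proportion.

```python
import math

def arcsine_sqrt(percent):
    """Arcsine square-root transform of a percentage (result in radians, 0 to pi/2)."""
    p = percent / 100.0
    if not 0.0 <= p <= 1.0:
        raise ValueError("percent must lie between 0 and 100")
    return math.asin(math.sqrt(p))

def inverse_arcsine_sqrt(value):
    """Back-transform a value on the arcsine square-root scale to a percentage."""
    return 100.0 * math.sin(value) ** 2

half = arcsine_sqrt(50.0)  # equals pi / 4
```

The transform stretches the ends of the 0%-100% scale, which is why it is commonly applied to proportions before correlation or regression.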

| Evaluation of real-time occurrence estimates with archived audio
Comparisons between occurrence estimates determined in near real-time and those determined during the audio analysis indicated remarkably low false detection rates (Tables 1 and 2). Of all the species, only fin whales had a false detection, and this occurred in only a single 15-min period. Missed detection rates for all species ranged from 27% to 67% during 15-min tally periods and 12%-42% over daily time scales (Table 2). Fifteen-minute tally periods were scored in near real-time as "not detected" when there was evidence of acoustic presence in the archived audio for several reasons (Table 3). For right whales, the most common reason (67% of missed detections) was that upcalls occurred after the 8-kilobyte per hour limit was reached and before the end of the 15-min tally period (i.e. upcalls were available for the audio analyst to detect, but not available for the near real-time analyst to detect). More often for other species, tally periods were scored as "not detected" in near real-time because [...] For fin whales, which require the detection of several 20-Hz pulses with a constant inter-pulse interval, tally periods were often scored as "not detected" because not enough pulses were identified in near real-time to be confident of the species' presence. Over daily time scales, the probability of missed detection was significantly related to the amount of calling activity, measured as the percentage of tally periods that were scored as "detected" in the audio analysis during a single day (Figure 5). Fitted logistic regression models suggested that if 12%, 33%, 19% and 22% or more tally periods were scored as "detected" during a day in the audio analysis (i.e. if observed calling rates were modest or high), then the probability of daily missed detections in near real-time dropped to 10% or less for right, humpback, sei and fin whales respectively (i.e. then the chance of missing occurrence in near real-time was low) (Figure 5).
The vast majority of tally periods scored as "possibly detected" in near real-time for right, humpback and fin whales were scored as "detected" during the audio analysis (Table 4). Together with the very low false detection rates, this indicates that the analyst was quite cautious in scoring periods as "detected" (as encouraged by the protocol). For sei whales, roughly half of the tally periods scored as "possibly detected" were determined to have evidence of sei whale presence in the audio analysis.
The time series of "detected" and "possibly detected" scores de- [...] were particularly high for right and fin whales (r² = 0.904 for both).
Slopes of the corresponding regressions were less than 1 for all species, indicating that acoustic detection rates were underestimated in near real-time. This is not surprising considering the missed detection rates described previously.

| Evaluation of real-time occurrence estimates with visual sightings
Of all the species examined, the best agreement between visual and near real-time acoustic detections was observed for right whales (Figure 7a; Table S1). The log odds ratio test was significant (p < α_Bonferroni) for most radii and time intervals, but the best agreement between visual and acoustic occurrence was within 30-40 km of the buoy and 24-48 hr prior to an aerial survey (Table S1). [...] In contrast to right whales, there were no associations observed between visual and near real-time acoustic detections of humpback whales at any radii or time interval (Figure 7b; Table S1).
Sei whale occurrence estimates from aerial surveys and near real-time passive acoustic monitoring were significantly associated only at 30 and 40 km radii around the buoy, and only within 24 hr of an aerial survey (Figure 7c; Table S1). Acoustic detection rates were modest when sei whales were encountered by the aerial surveys (6 of 9 days; 66.7%), but acoustic detection rates were appropriately low when sei whales were not encountered by the aerial surveys (3 of 27 days; 11.1%). Fin whale occurrence estimates from aerial surveys and near real-time passive acoustic monitoring were significantly associated within 40 km of the buoy and 24, 36 and 72 hr prior to an aerial survey (Figure 7d; Table S1). Fin whales were most often (10-11 of 11 days; ≥ 91%) acoustically detected on days when they were sighted by the aerial surveys within 40 km of the buoy, but fin whales were also acoustically detected when the aerial surveys did not encounter fin whales (7-14 of 25 days; 28%-56%).
FIGURE 5 Probability of missing occurrence in near real-time over daily time scales as a function of daily calling rates derived from the audio analysis (i.e. the daily percentage of tally periods scored as "detected" in the audio analysis) for [...]

The probability of encountering right, sei and fin whales during an aerial survey was significantly associated with near real-time acoustic detection rates of those species prior to the aerial survey (Figure 7e,g,h; Table S1), but there was no such association for humpback whales (Table S1). Fitted logistic regression models suggested that detecting right whales in just 1%-4% of all reviewed tally periods within 24-72 hr prior to an aerial survey was associated with a 50% probability of encountering a right whale within 30-50 km of the buoy during the aerial survey (Figure 7e; Table S1). Similarly, detecting right whales in just 6%-15% of tally periods over 24-72 hr prior to an aerial survey was associated with a 90% probability of encountering a right whale within 30-50 km of the buoy during the aerial survey (Table S1). Logistic regression models suggested that detecting sei whales in 4%-6% and 13%-20% of tally periods over 48-72 hr prior to an aerial survey was associated with a 50% and 90% probability of encountering a sei whale within 30-60 km of the buoy during the aerial survey respectively (Figure 7g; Table S1). Detecting fin whales in 16%-17% and 61%-64% of tally periods within 24-48 hr prior to an aerial survey was associated with a 50% and 90% probability of encountering a fin whale within 40 km of the buoy during the aerial survey respectively (Figure 7h; Table S1).
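Thresholds of this kind come from inverting a fitted logistic regression. The sketch below shows the inversion only, assuming the predictor was the arcsine square-root transformed detection percentage (as the statistical treatment section implies for percentages used in regression). The intercept and slope are hypothetical placeholders chosen for illustration; they are not the coefficients fitted in the study, which are reported in Table S1.

```python
import math

def detection_rate_for_probability(prob, intercept, slope):
    """Percentage of tally periods at which a logistic model predicts probability `prob`.

    Assumes logit(prob) = intercept + slope * arcsin(sqrt(percent / 100)).
    """
    logit = math.log(prob / (1.0 - prob))
    x = (logit - intercept) / slope  # predictor on the arcsine square-root scale
    if not 0.0 <= x <= math.pi / 2:
        raise ValueError("requested probability is outside the range the model can reach")
    return 100.0 * math.sin(x) ** 2  # back-transform to a percentage

# Hypothetical coefficients, for illustration only (not from the paper).
intercept, slope = -2.0, 10.0
rate_50 = detection_rate_for_probability(0.5, intercept, slope)
rate_90 = detection_rate_for_probability(0.9, intercept, slope)
```

With any positive slope, the detection rate needed for a 90% encounter probability is higher than that needed for 50%, which is the pattern reported for all three species.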

| DISCUSSION
The mooring design was successful, allowing for quiet continuous operation for two yearlong deployments in an area that is exposed to intense New England storms and oceanic swell owing to unlimited fetch from the south. Near real-time false detection rates were virtually zero, indicating that when a tally period is scored as "detected", the analyst is nearly 100% correct. Such high accuracy in near real-time is attributable to (a) having an analyst as part of the detection process and (b) having a protocol that stresses conservatism in scoring. The greatest advantage of having an analyst review the detection data is the assessment of context. The human analyst can consider context in a way that is not yet available in automated detection and classification systems for marine mammal sounds. Many automated detectors attempt to determine species presence based on a single call with little or no regard for the noise environment, other sounds in temporal proximity to a call of interest, or patterning in calls. An analyst can take such contextual information into account (when reviewing either archived audio or time series of pitch tracks, e.g. Figure 3), which increases accuracy significantly. The need for low false detection rates provided by the analyst must always be weighed against the cost of the analyst; we found in our study that the analyst spent about 30-45 min per day per platform reviewing pitch tracks and scoring tally periods.
We have developed a protocol that encourages the analyst to score "detected" only with convincing evidence of a species' acoustic presence. It is important to recognize that the protocol was [...] an analyst would score a tally period as "detected" if there was any evidence of a species' acoustic presence (rather than convincing evidence, which was the criterion used in this study). Such a change in the protocol for our study would have resulted in a modest increase in false detection rates for right, humpback and fin whales, but a substantial increase in false detection rates for sei whales (Table 4).
FIGURE 7 (a-d) Acoustic detection rates when (a) right, (b) humpback, (c) sei and (d) fin whales were visually detected during aerial surveys (y-axis) and when not visually detected during aerial surveys (x-axis) within particular radii of the buoy and within particular time intervals prior to the start of an aerial survey (data in Table S1). Large open symbols are for radii and time intervals that have significant log odds ratio tests (p < α_Bonferroni); small filled symbols have non-significant log odds ratio tests. Symbols are jittered by less than ± 1% to improve clarity. Symbols located in the upper left-hand corner of the plot would indicate excellent agreement between the visual and acoustic observations. (e-h) Logistic regression model results showing the probability of encountering a (e) right, (f) humpback, (g) sei or (h) fin whale within particular radii of the buoy during an aerial survey against the percentage of tally periods with those species scored as acoustically detected in near real-time within particular time intervals prior to the start of an aerial survey. Fitted regression lines are shown for significant models only (drop-in-deviance test had p < α_Bonferroni; data in Table S1), whereas the grey area indicates the standard error for all fitted lines plotted on top of one another.

The daily missed detection rate for right whales was 27% (Table 2), but observed daily calling rates (i.e. rates of received calls) were low on all of the 7 days when presence was missed in near real-time (i.e. on days when less than 10% of tally periods were scored as "detected" during the audio analysis; Figure 5a). For such days with low calling rates, right whales are only acoustically available to be detected in near real-time in just a few tally periods, and with missed detection rates of 42% for individual 15-min tally periods (Table 2), positive detection is not always possible. Unlike humpback or fin whales that are prodigious callers once a calling bout is initiated, right whales often produce sporadic upcalls in low numbers without pattern, so the opportunities for detection are fewer. However, if the 15-min missed detection rate is constant and missing detections in a tally period is independent of missing detections in the next tally period, then the probability of missing right whale acoustic presence in two tally periods in a day is 42% × 42% = 17.6%, and the probability of missing right whale presence in three tally periods in a day is (42%)³ = 7.4%. Hence, as calling rates increase, we expect the probability of daily missed detection to decrease, even for a sporadic caller (as observed in Figure 5a). There may be some benefit to sending more pitch track data than allowed by the 8-kilobyte limit used in this study, since many missed detections at the 15-min time scale were caused by the cessation of pitch track transmission (Table 3); however, the additional costs of data transmission and analyst time must be weighed against the potential reduction in missed detection rates at the 15-min time scale, which presumably will help to lower the missed detection rate at the daily time scale. The strong association between acoustic detections and aerial survey sightings (Figure 7a) suggests that silent right whales were not especially problematic for estimating occurrence of the species in the study area; when right whales were seen, they were typically also heard, particularly over 30-40 km spatial scales and 24-48 hr temporal scales.
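The independence argument for right whales can be checked with a short calculation. The sketch below takes the 42% per-period missed detection rate from the text and assumes, as the text does, that misses in different tally periods are independent with a constant rate; the function names are hypothetical.

```python
import math

def prob_all_missed(p_miss, n_periods):
    """Probability of missing presence in every one of n independent tally periods."""
    return p_miss ** n_periods

def periods_needed(p_miss, target):
    """Smallest number of calling periods for which the daily miss probability is <= target."""
    return math.ceil(math.log(target) / math.log(p_miss))

two = prob_all_missed(0.42, 2)    # 0.42 squared, about 17.6%
three = prob_all_missed(0.42, 3)  # 0.42 cubed, about 7.4%
n_for_10_percent = periods_needed(0.42, 0.10)
```

Under these assumptions, calls in three tally periods are enough to bring the daily miss probability below 10%.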
Humpback whales had higher missed detection rates than right whales at 37% on daily time scales (Table 2), but missed detections were also strongly associated with low calling rates (Figure 5b).
Our analysis of missed detections at the 15-min temporal scale suggested that faint calling was often the cause of missed humpback whale detections (Table 3). Because humpbacks are most easily identified by the numerous patterned calls that make up their songs, faint singing is more detectable in spectrograms by an analyst than if they made few sporadic calls like right whales (i.e. the faint pattern can be recognized in the spectrograms better than a faint single call). While an analyst can identify this faint singing, such faint song units are difficult to pitch track. Therefore, it is likely that the higher missed detection rate for humpbacks is attributable to the difference in detection capabilities of a human and the pitch-tracking algorithm. Although there was a significant correlation between occurrence estimates derived from the near real-time and audio analyses (Figure 6f), there was no association between acoustic detections and visual sightings (Figure 7b). We suspect that this lack of association is related to our use of song to identify humpback whales acoustically. Song is produced by males (Payne & McVay, 1971), and one can imagine a situation where a single male is near the buoy singing; this single animal is easily detected acoustically, but difficult to detect visually during an aerial survey. Conversely, one can imagine a group of several females that are easy to detect visually, but very difficult to detect acoustically since none of the females are singing. Hence when using song for humpback whale detection, there may not be a strong relationship between what one hears and what one sees.
Sei whales had the highest missed detection rates of any species on both 15-min and daily time scales (Table 2). No one factor stood out strongly as the reason for these higher missed detection rates at the 15-min temporal scale (Table 3), but as with all of the other species, missed detections in near real-time were strongly associated with low calling rates at the daily time scale (Figure 5c).
Sei whales produce sporadic calls in low numbers like right whales, but sometimes with very short patterns (doublets or triplets; Baumgartner et al., 2008). Near real-time occurrence estimates were significantly correlated with occurrence estimates derived from the audio analysis (Figure 6g), and were significantly associated with sightings at 30-40 km spatial scales and 24-hr time scales (Figure 7c). Acoustic detections were appropriately low when sei whales were not sighted during aerial surveys, but acoustic detections were only modest when sei whales were encountered during aerial surveys. This could certainly be a consequence of missed detections, but it could also reflect silent animals.
Fin whales had the lowest missed detection rates of any species on both 15-min and daily time scales (Table 2). Fin whales call in trains of 20-Hz pulses that are separated by a nearly constant inter-pulse interval (Morano et al., 2012; Watkins, Tyack, Moore, & Bird, 1987). The pattern of these pulses is easily recognized both in an audio analysis and in pitch tracks when correctly classified. We rely strongly on the automated classification of 20-Hz calls since the frequency resolution of the spectrogram used by the DMON/LFDCS in the 20-Hz call band is very coarse. When calls are not classified because of interfering sound (including calls from other fin whales) or a clear pattern with a constant inter-pulse interval is not apparent, our protocol encourages the analyst to be sceptical. As with the other species at daily time scales, missed detections in near real time were strongly associated with low calling rates (Figure 5d), but there was very good agreement between daily calling activity derived from the audio and near real-time analyses (Figure 6h). There was a significant association between acoustic and visual occurrence estimates at 40 km spatial scales and 24-72 hr temporal scales (Figure 7d). At these scales, fin whales were nearly always acoustically detected when sighted by the aerial surveys, but they were also acoustically detected when not seen by the aerial surveys. This pattern could be caused by false detections, but we observed that the near real-time false detection rate is nearly 0% for fin whales (Table 2). It is more likely that fin whales sometimes go undetected by the aerial surveys, perhaps because they do not often aggregate in large groups (Hain, Ratnaswamy, Kenney, & Winn, 1992) or their acoustic detection range exceeds the spatial scales examined here (>60 km; we are unaware of published acoustic detection range estimates for fin whales in shallow neritic waters, so this hypothesis is currently difficult to address).
Interestingly, fin whale 20-Hz pulse trains are thought to be a reproductive display by males (Croll et al., 2002) like humpback singing, but the association between acoustic and visual occurrence estimates for fin whales was much stronger than that for humpback whales.
The acoustic detection range for the monitored species is much smaller than the spatial scales at which we observed significant associations between aerial survey and near real-time acoustic occurrence estimates. Right whales are estimated to have detection ranges of up to 9 km in shallow continental shelf waters (Clark, Brown, & Corkeron, 2010), and humpback whales, producing calls at similar frequencies and source levels as right whales (Au et al., 2006; Clark et al., 2010; Thompson, Cummings, & Ha, 1986), likely have a similar detection range. Sei whales produce lower frequency calls at louder source levels (Baumgartner et al., 2008; Newhall, Lin, Lynch, Baumgartner, & Gawarkiewicz, 2012), and Baumgartner et al. (2008) estimated an acoustic detection range of 10-15 km. Fin whales produce the loudest and lowest frequency calls of all of the species studied here (Charif, Mellinger, Dunsmore, Fristrup, & Clark, 2002), and may have detection ranges of several tens of kilometres in shallow neritic waters. With detection ranges of 9-15 km for right, humpback and sei whales, why would the best associations between acoustic and visual occurrence estimates be observed at 30-40 km?
While the instantaneous detection range of the buoy may be 10-20 km for these three species, whales move over the time scales of the analysis presented here (e.g. 24-48 hr), so the time and location at which they are acoustically detected are rarely the same as the time and location at which they are visually detected. A whale that is calling near the buoy on one day may be 30 km away on the next day when it is detected by the aerial survey. The implications of this are important.
If acoustic detections are to be used for mitigation over time scales of a few days, then the movement of whales must be taken into account. The spatial scale over which there are significant associations between aerial and acoustic occurrence estimates can be thought of as the monitoring range of the acoustic system, which is different from its detection range. We define the monitoring range as the area over which whales that are acoustically detected will move over a specified time scale (see supporting information). It is dependent on short-term (tens of hours to days) movement behaviour, of which we know little for whales, but we have estimated the monitoring range empirically here using associations between acoustic detections and visual sightings.
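The distinction between detection range and monitoring range can be illustrated with a deliberately simple sketch. It is not the definition given in the supporting information: the function `monitoring_radius_km` is hypothetical, it treats the monitoring range as a radius equal to the detection range plus straight-line net displacement, and the net movement rate of 1 km/hr is an assumed value chosen only for illustration.

```python
def monitoring_radius_km(detection_range_km: float,
                         net_speed_km_per_hr: float,
                         hours: float) -> float:
    """Rough radius within which a whale detected at the buoy could be found later.

    Simplifying assumption: the whale moves away from its detection location
    at a constant net rate, so radius = detection range + net speed * time.
    """
    if detection_range_km < 0 or net_speed_km_per_hr < 0 or hours < 0:
        raise ValueError("all inputs must be non-negative")
    return detection_range_km + net_speed_km_per_hr * hours


if __name__ == "__main__":
    # 9 km is the right whale detection range cited in the text;
    # 1 km/hr net displacement is a hypothetical value, not a measured one.
    for hours in (0.0, 24.0, 48.0):
        radius = monitoring_radius_km(9.0, 1.0, hours)
        print(f"after {hours:.0f} hr: about {radius:.0f} km")
```

Even a modest assumed net movement rate yields a radius of a few tens of kilometres after a day, which is why the monitoring range can be much larger than the instantaneous detection range. Real movement is unlikely to be straight-line, so an empirical estimate such as the one derived here from sightings remains preferable.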
The near real-time estimates of occurrence from the DMON/LFDCS buoy were accurate, producing false detection rates of 0% for right, humpback and sei whales, and nearly 0% for fin whales.
The analysis protocol was purposely designed to be conservative to produce low false detection rates for marine mammal mitigation applications at the expense of higher missed detection rates.
There are several U.S. Coast Guard gunnery training ranges near Nomans Land Island, and the DMON/LFDCS buoy was used to deliver near real-time detections of right whales directly to the Coast Guard operations centre in Woods Hole, Massachusetts to aid in minimizing interactions between Coast Guard vessels and right whales during training exercises. In addition to reducing ship strike risks to right whales by postponing training exercises when whales were present, the system saved the Coast Guard time and mobilization costs by reducing the chances that right whales will be encountered during an exercise, which would force the immediate cancellation of the exercise and the return of the training ships to port. We hope to expand the use of the system in the near future for other applications, including mitigating ship strikes in areas heavily trafficked by commercial ships and noise exposure during wind farm construction.

ACKNOWLEDGEMENTS
We thank Annamaria Izzi, Danielle Cholewiak and Genevieve

DATA AVAILABILITY
All the data presented here are accessible on Dryad Digital