The latest installment of EEG Video’s series of live, free-to-attend webinars is now available!
On April 29, 2021, we launched our Closed Captioning 101 series with Closed Captioning 101: Live Streaming Solutions. Matt Mello, Sales Associate for EEG, and Bill McLaughlin, VP of Product Development and CTO for EEG, discuss the big impact that virtual events have made on captioning workflows.
Visuals, live streaming, and flexibility have all seen significant changes in today’s virtual event era. EEG’s closed captioning solutions make it easier than ever for event producers, educators, and more to successfully deliver highly accessible events. The result: a better audience experience, and even greater reach for events, meetings, and broadcasts.
Closed Captioning 101: Live Streaming Solutions • April 29, 2021
This webinar featured:
- The impact of virtual events on changing workflows for communications, event production, and education verticals
- Methods for reaching more audiences via virtual events with improved accessibility
- Breaking closed captioning innovations that benefit live streaming
- A live Q&A session
Bill and Matt discussed the following EEG solutions:
- HD492 iCap Encoder
- iCap Connect Model AV650
- Falcon Live Streaming RTMP Encoder
- Alta Software Caption Encoder
- Lexi Automatic Captioning
- iCap Translate
EEG Video is your source for the latest closed captioning news, tips, and advanced techniques. Visit here to learn more about upcoming EEG webinars, plus all our previously-streamed installments!
Regina: Hi everyone, and thank you so much for joining us for today's webinar, Closed Captioning 101: Live Streaming Solutions. My name is Regina Vilenskaya and I'm the Director of Marketing here at EEG. With me on this webinar is Matt Mello, Sales Associate at EEG. For the live Q&A session we’ll be joined by Bill McLaughlin, EEG's VP of Product Development. This is the first webinar of our Closed Captioning 101 series and will focus on solutions for live streaming. We want to provide a thorough introduction to the challenges EEG has helped all sorts of organizations solve, but also make sure we're highlighting the most common needs of our customers as of late.
During today's webinar, Matt will talk all about how virtual events have changed workflows, from visuals to live streaming to flexibility. Matt will cover a range of closed captioning solutions and the features that make it easy for event producers, educators, and more to deliver accessible (and thus successful) events, meetings, and broadcasts.
Now I'm very happy to welcome Matt to kick us off with Closed Captioning 101: Live Streaming Solutions. Welcome, Matt!
Matt: Hi everyone, thank you so much for joining us today. My name is Matt and I'm with the sales team here at EEG. So today's webinar is going to be focusing primarily on adding live captions to your workflow in a live streaming environment and how live virtual events have changed these workflows for event production, educational outfits, and communications verticals. We're also gonna cover how to reach broader audiences with more accessible events, as well as go over the latest closed captioning advancements for live streaming.
So anyone who's not familiar with EEG, we could be thought of as a one-stop shop for all things captioning. We've been a leading manufacturer of broadcast video equipment for decades, helping customers from nearly all industries get compliant with our products. Our equipment and services can be found in most, if not all, major broadcasting facilities around the country, as well as many production and event companies. EEG has a wide range of solutions for SDI, IP, and RTMP livestreaming encoding, as well as live caption data creation. If you're creating any live video content, we have all the captioning solutions and expertise to ensure that your content is accessible and compliant.
And before we go any further, I'd like to share a very exciting announcement. As of this week, EEG has been acquired by Ai-Media. Joining forces with the Ai-Media team will allow EEG to reach new regions of the world and see a greater impact in our captioning solutions. We will continue to provide the high-quality solutions and services that have driven our customer relationships for more than 35 years. You can visit our blog or go to the link shown onscreen for more information here. If you have any questions pertaining to this acquisition, please enter them in the Q&A tool in Zoom and we'll reach out just following the webinar. So now let's get into today's topic, which is closed captioning solutions for live streaming.
Now as I’m sure we all know, there's been a big push since last year to move everything virtual, so we've seen a huge change in how people are getting their content out to their audiences, while also needing to keep their content as accessible as possible. One of the best ways to do this is to add captions to your events. Having live closed captions available is not only an aid for those who are deaf or hard of hearing, but also for anybody who has an easier time following along by reading rather than just listening. Thanks to virtual events, visuals have now become more important than ever. With all the attention on a single screen, the pressure is on to make events more interactive and entertaining for your audiences. This is especially the case for event producers. Many customers who have transitioned from in-person events to virtual ones have told us just how important it is to carry this over to the live streaming arena. But to do this, you need a way of embedding the closed captions into the video so that viewers at home can have the option to enable or disable them as they please.
HD492 iCap Encoder
So to start off, we're gonna go over an encoder that many in the field are already very familiar with, which is the HD492 iCap Encoder. This has been our flagship encoder for several years now, and it has been used in countless production facilities around the country. The HD492 accepts an SDI input from your video source and sends the audio over iCap to a captioning agency of your choice, or to our Lexi automatic speech recognition service, which we’ll go over a bit later. Captions are then sent back down over iCap and embedded into the video as VANC data. You can then send the video out as SDI with embedded closed captions, or with captions burned into the video as open captions with the decoding feature. As we move into the future, however, we've seen that some outfits are moving away from 3G 1080p workflows and towards 12G 4K UHD. So we're very excited to say that we're now shipping an encoder for this.
iCap Connect Model AV650
So our newest encoder is called the iCap Connect AV650 and supports native 12G caption encoding and decoding, making it a great fit for future-proofing any production studio, or for event spaces that are looking to outfit in 4K resolution. Prior to the pandemic, there was a growing interest in UHD for in-person events, and as we hopefully move towards opening up again, this interest has continued to climb. Since the AV650 supports 4K caption encoding and decoding, this means that you can now overlay captions on a 4K display or in ATSC 3.0 for amazing and accessible in-room monitors. The SDI signal with captions can then also be sent to a separate streaming media encoder, and then converted to RTMP for a live streaming workflow. This model may be the perfect fit for an event production company that is looking to expand its audience by having the flexibility to work with broadcast, live stream, and in-room spaces, while also having the option to encode in 4K. Both the HD492 and AV650 are 1RU SDI encoders, which may fit nicely into your existing production rooms without much adjusting at all.
But I’m betting that the reason most of you have tuned into this presentation today is because a lot of this is new to you, and you may not have a working knowledge of broadcast-style video production. Additionally, many organizations that have been doing most things in-person have been forced to quickly adapt to the online world. Most meetings are now held on conferencing platforms, with some of these meetings actually being pushed out to broadcast as well.
Adoption of streaming has become more widespread due to COVID, and many of the organizations who have not had to use any live streaming before may now have a better understanding of it moving forward and how they can utilize it productively. In addition to event-based companies, many customers in education, communications, and government have had to adopt live streaming workflows as well. Unfortunately, oftentimes these customers will forgo captions altogether because they’re unfamiliar with live streaming as a whole, so adding captions might seem like a daunting task. Some people may already be live streaming, but aren’t sure where to start with captioning, and some may be under the impression that their live streaming player of choice doesn't support closed captions at all.
So EEG Cloud is our self-service cloud platform where you’ll find a suite of captioning tools right within your web browser. This includes our Falcon, Lexi, and iCap Translate services. EEG Cloud was created with the growing live streaming environment in mind, and makes adding captioning a very simple and straightforward process.
Over the last year we’ve seen a boom in customers, particularly event producers and educators, who have turned to our cloud site to caption their live meetings. You can manage all of your EEG subscriptions and plans right within the dashboard, as well as sign up for any new services you might be interested in. These services are all available in monthly or annual plans, depending on your company’s preferences.
EEG Cloud also allows you to manage all of your settings for Lexi and iCap Translate, even if they are being used with a hardware encoder. But if used with Falcon, these tools can be accessed and used without adding any additional hardware to your current live streaming workflow. You can create an account for free and take a look around at eegcloud.tv. While there, you’ll notice there are options to trial most of our services, which we highly encourage doing so you can get an idea of how they’ll work for you.
Falcon Live Streaming RTMP Encoder
So one of the primary functions of EEG Cloud is hosting the interface for our Falcon streaming encoder. Falcon accepts any RTMP input, so you have a ton of flexibility with your source video options. We’ve confirmed that most major streaming services will work with Falcon, as long as they can accept an RTMP or HLS input, but a little more on HLS later. Setting up Falcon is a fairly simple process, even for companies who are just getting started, and we can provide all of the documentation and support needed to get up and running with it. This is a perfect solution for anyone looking for an easy and compromise-free solution to live stream captioning.
So in the picture at the top, you can actually see a diagram of how Falcon fits in with your workflow. Falcon sits between the streaming media encoder and the streaming platform. The live video source and program audio are sent to Falcon through RTMP by a streaming media encoder, whether that’s AWS Elemental, Telestream Wirecast, OBS, or something similar. Your program audio is then captioned by a live captioner or by Lexi, and live caption data is returned to Falcon and embedded into the video as 608 data. From Falcon, you can route the stream to the streaming platform of your choice, and can even send the stream to multiple platforms at once if you’d like. Most live streaming platforms will work with Falcon if they can accept an RTMP or HLS stream, and they also support closed caption decoding. One of the most common questions we get with Falcon is something like, “Will Falcon work with blank streaming encoder, and blank player?” And the answer is usually yes, as long as the player can accept a stream and decode the captions. And to emphasize here, using Falcon requires no additional hardware or software to add captions to your live stream.
So up until now, adding captions into an RTMP stream as 608 data has been the best and most efficient way to get captioning into a live stream, as most live streaming players accept RTMP as an input. However, this method works based on US captioning and broadcast standards that significantly show their age on issues like world languages capabilities. With RTMP, you can caption anything in English or Western European languages, but support is very limited for any alphabets outside of that.
We get a lot of requests from customers who want to have accessible events in multiple languages, so we've just released HLS VTT output for Falcon, which is a new system that allows you to caption in essentially any language. As long as the language is codable in Unicode and uses UTF-8 coding to get into your player, and you have a font available for it, you can caption in that language. That means that just about any language can be added as a captioning option.
This new system converts your RTMP feed into an HLS stream, which is already a popular technology for delivering the stream to consumers. But by converting to HLS before you move to the rest of the production system, you can create a caption track that moves out of the old standards and into the new standards like VTT and TTML, which allows for much more flexibility in language choices. There are already a variety of live stream players that support HLS, such as Akamai, Wowza, and JWPlayer. Below you can see screenshots of captions displayed in JWPlayer in both Japanese and Russian. These languages would not be possible with RTMP output only. Now you can open up access into new countries and target audiences that you might not have been able to access before, as it has become easier than ever to add multiple different languages of captions into your livestream at the same time.
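To make the Unicode point concrete, here is a minimal sketch of what a WebVTT caption file carrying Japanese and Russian cues looks like when serialized as UTF-8. It illustrates the general WebVTT format only and is not Falcon's actual output; the helper names `format_timestamp` and `build_webvtt` and the sample cue text are hypothetical.

```python
from typing import Iterable, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, caption text)


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: Iterable[Cue]) -> str:
    """Build a WebVTT document: a header line, then one block per cue."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical sample cues in two non-Latin scripts.
cues = [
    (0.0, 2.5, "こんにちは、皆さん"),
    (2.5, 5.0, "Здравствуйте"),
]

# WebVTT files are UTF-8, so any Unicode text can be carried in a cue.
vtt_bytes = build_webvtt(cues).encode("utf-8")
```

Because the cue text is ordinary UTF-8, the only remaining requirement on the viewer side is a player that reads the VTT track and has a font for the script.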
So here's a block diagram of how this works. Notice everything is very similar to the last diagram, except the output stream from Falcon has its own URL that can be pulled by your streaming player. You’re still providing an RTMP input to Falcon, but the new HLS output supports a much higher variety of choices. You can still absolutely use the RTMP output of Falcon like we’ve had before, but now there’s much more flexibility for outputs, and you can reach an even larger audience with far more language options than we had available before.
And this slide shows a quick look at what Falcon’s interface looks like. Here is where you’ll be doing most of the configuring for inputs, outputs, and connecting with the captioner you’ll be using. This is also where you can select the HLS option, as well as adding translations with iCap Translate. You’ll also notice a preview window for the source input, as well as the captioned output. This is a very useful tool for making sure that everything looks as it should as it passes through Falcon.
We also have an HTTP version of Falcon available for any platforms that have a separate HTTP uplink specifically for captioning. Let’s use Zoom for an example. Instead of sending an RTMP stream through Falcon and having captions added to it, you have a separate HTTP link that only the caption text is sent to. If you're the presenter or organizer for the meeting or event, you can get that link by going into the CC display at the bottom of the window and retrieving that URL, which is typically unique for each Zoom meeting. So with this method, you can have caption data added directly within a conferencing platform or CDN without any additional hardware.
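As a rough sketch of the idea of a caption-only HTTP uplink, the snippet below builds (but does not send) a POST request that carries just the caption text to a meeting-specific URL. The `seq` and `lang` query parameters and the plain-text body are assumptions modeled on how third-party caption URLs of this kind are commonly described; the URL, the function name `build_caption_request`, and the sample text are hypothetical, and none of this is taken from EEG's implementation.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_caption_request(caption_url: str, text: str, seq: int,
                          lang: str = "en-US") -> Request:
    """Build a POST request that sends one chunk of caption text.

    `seq` is an assumed per-meeting counter that orders the chunks, and
    `lang` is an assumed language tag; check the platform's own
    documentation for the exact parameters it expects.
    """
    # Append to the URL as-is so the meeting token is not re-encoded.
    separator = "&" if "?" in caption_url else "?"
    url = f"{caption_url}{separator}{urlencode({'seq': seq, 'lang': lang})}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


# Placeholder URL standing in for the unique link copied from the meeting.
req = build_caption_request(
    "https://captions.example.invalid/closedcaption?id=123&token=abc",
    "Hello, everyone.",
    seq=1,
)
```

Only text travels over this link; the video itself never passes through the captioning service, which is what distinguishes this path from the RTMP workflow.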
Alta Software Caption Encoder
So as we’ve gone over the past few slides, SDI caption encoding can be done with our HD492 and AV650, and RTMP caption encoding can be done by Falcon. But if you’re looking to caption Transport Stream or 2110 video, we also have a product called Alta, which can kind of be thought of as a big brother to Falcon. Alta supports embedding captions in many stream formats such as RTMP, MPEG-TS, SMPTE 2110, and AWS CDI. It can be hosted locally and delivered as a VM image, an AMI, or in a complete turnkey server package. We also have the option for renting hosted Alta instances for event productions. Alta allows for higher bandwidth and more control over your workflow. This could particularly be useful for broadcasters and event producers looking to future-proof or create higher-quality live productions. We’d be happy to discuss Alta further offline here if you think it may be the right product for you.
So with the rise of virtual events, one opportunity that has presented itself is flexibility as it pertains to workflows, to budgets, to reaching wider audiences and even to how you caption. Up to this point, we’ve discussed the ways that you can embed live closed captions into your video, but we haven’t yet discussed the actual means that the caption data is created. A common question we get asked is, “Can I use human captioners with EEG’s encoders?” And the answer is yes, absolutely you can. Captioning agencies can receive the audio from any of the encoding solutions we’ve mentioned today using our iCap network, which most captioners around the country are already very familiar with.
However, as we transition to virtual workflows, many clients are reevaluating their entire operation, including budgets and scheduling. So how do you decide between a captioning agency or an automatic speech recognition system? Well, the answer is you really don’t have to. Depending on the size of your event, your budgetary considerations, and the availability of captioners, you may want to consider either option, and you can freely switch between either with any of our encoders that we’ve discussed so far today.
Lexi Automatic Captioning
So Lexi is our automatic speech recognition system, which is quickly becoming one of the most popular ways to caption in America. We’ve seen a tremendous increase in interest for the service as people are realizing that it far exceeds their expectations of what automatic speech can be. We have made great strides in updating Lexi since it was first introduced four years ago, and we’re always looking for ways to make it even better. Lexi Live is accessible through the same cloud platform as Falcon, so it’s simple to set everything up for your live stream right within the same webpage.
The base accuracy for our newest revision of Lexi is approximately 96%, but that can be improved by making use of Topic Models. There’s no advanced notice needed to start up Lexi, so there will never be any issues with scheduling like there might be with a captioning agency. Currently, the default backend for Lexi is AWS Transcribe but, like I said, we’re always looking to keep Lexi as up to date as possible, so we have it configured in a way that allows us to switch engines when we feel there will be an improvement by doing so.
So Lexi 2.0 is an update that was deployed a few months ago, and it adds several desirable features such as improved accuracy, scheduling, and better Topic Model control. Lexi 2.0 has reduced the word error count by 30-50% for many users, and recognition of punctuation has become greatly improved also. The new scheduling feature means that you can set a time and date for Lexi to run in the future, and you don’t need anyone to start the job manually. This is a great addition for anybody who has recurring or planned events so you can set it ahead of time. Lexi 2.0 is currently available for any existing or new users, and will be the default option when creating a new Lexi job.
And as I was saying before, Lexi’s base accuracy starts at 96%, and you can improve that even further by picking from Core and Topic Models. These allow you to pick from a selection of curated models that pertain to certain topics, so that you can gear Lexi in the right direction when captioning. For example, if you know that your event will feature current news and events, then you can select the “Headline News” model when starting a new Lexi job. This will give Lexi a list of words that it is more likely to hear and then pick from more accurately.
You can also import your own custom words and proper nouns to further increase accuracy for your particular topic. Things like names of nearby towns, people who might be discussed, and other specific vocabulary are very helpful in getting more accurate captions from Lexi.
A great way of displaying how useful Topic Models can be is by looking at the captioning in this webinar. We’ve uploaded many of the words and acronyms that are actively used in our day-to-day vocabulary in the captioning industry, and Lexi has a much better idea of what we are talking about. You can create as many Topic Models as you would like, and choose from any one of them for any upcoming event, so this is an extremely powerful tool that we highly recommend checking out if you plan on using Lexi anytime in the future.
We also have an optional add-on program called Lexi Leash, which is a free Windows application that makes managing your Lexi jobs easier. This tool was created in consideration of organizations who may not be staffed with experts in the captioning and production field, so it might be very beneficial for them to easily track Lexi usage quotas, monitor current jobs, restart similar previous jobs, and prevent any accidental overage charges. If you are already using Lexi, this can be downloaded for free from our website if you’d like to try it out.
And while Lexi only handles the program audio captioning, we’ve also recently gotten a large number of requests specifically for translation, especially after going virtual. Many people have been searching for this feature in their live streams, as they’re looking to reach as large of an audience as possible. iCap Translate allows you to do this while also being very affordable. With iCap Translate, you can add several languages to your video stream at the same time as long as your end destination can support multiple tracks of captions. In relation to live streaming, this has been a common issue among live stream players, but with the recent addition of HLS output for Falcon, you can now add up to six different languages simultaneously and have it displayed with ease, as long as your workflow supports HLS.
With RTMP output, you can translate into English, Spanish, French, Italian, Portuguese, German, Danish, and Maori. However, if using HLS output, the list of supported languages increases drastically. In addition to those supported by RTMP, iCap Translate in HLS allows for languages like Japanese, Chinese, Korean, Russian, Arabic, and many more, and we’re going to see that list expand over time as the technology continues to grow. iCap Translate works by taking existing caption data created either by Lexi or a separate captioning agency, and translating that text into the desired language.
So as mentioned earlier, Lexi and iCap Translate will work with any of our current encoding solutions, whether it's the HD492, AV650, Falcon, or Alta. The source video goes into the encoder, the audio is sent over iCap to our cloud service, and Lexi transcribes the caption data based on the audio. The caption data can then be sent back over iCap into the encoder and embedded into the video. iCap Translate can fit in here by taking existing caption data, created either by Lexi or a separate captioning agency, and translating that text into the desired language. This allows for even more flexibility, as you can use either automatic speech recognition or a human captioner, and translate the captions regardless of the source. All of our encoders are pre-programmed to work with Lexi and iCap Translate, so setting this all up is a very straightforward process.
So if you’re new to live streaming, or any type of event production, and you want to make your events as accessible as possible, we have a solution for you depending on what your workflow looks like. Adding visuals like captioning can greatly increase your audience reach, and give them more ways to interact with your content. And as the live streaming industry grows, we’re looking for more and more ways to add flexibility into your workflow so that adding captioning is as simple as possible. Products like the new AV650, advancements to our Falcon service, and a growing library of languages in iCap Translate are all steps we have taken to allow you to reach your target audience.
So that is going to conclude the presentation portion of this webinar. If you’d like to demo any of the services that I’ve mentioned today, we’d be happy to set that up for you. Just feel free to shoot us an email at email@example.com and we’ll get everything you need to get started. We’d also be happy to answer any pricing-related questions or any specific workflow questions you might have. And that is going to bring us into today’s Q&A portion.
Regina: Thank you very much, Matt. So yeah, if you have any questions and have not yet entered them into the Q&A tool at the bottom of your Zoom window, you are welcome to do that now. At this time, I would like to welcome Bill McLaughlin, EEG's VP of Product Development to join us for the Q&A session.
Bill: Hi, am I live?
Regina: You are, thank you.
Bill: I've already been kind of busy in the text questions.
Regina: Yes, so the first question that I would like you to answer is, Is there a Falcon vendor list that EEG recommends?
Bill: So I assume the question refers to third-party captioners, stenocaptioners, and the like that are qualified at entering data into iCap and Falcon. And really, using Falcon for those captioners is the same as using SDI encoders, using Alta, any other EEG products. And yes, there is a list on our website with about three dozen different established partners and if you want help finding that list, we certainly can email that offline.
Regina: Greg asks, Are there any facilities captioning at 1080p 59.94?
Bill: Yes. So I saw this question and I think Greg also put in a follow-up, saying that he's had a problem with scaling captioned videos from 1080p down to 1080i. And the thing to remember with that is that, since 608 and 708 standard closed caption data comes - you know, there is captions on each frame of the video from a technical perspective.
So when you're doing frame rate conversions or video standard conversions of captioned material, it's quite common for the captions to be damaged in that process. Some smart frame converters can handle that, but it's fairly common for that not to be handled right. And so if you're able or looking to kind of insert an extra product to fix that, something like one of our 515 Captioning Legalizers or using the legalizer function on a 492, essentially what you would be able to do is put the preconversion 1080p video into one of the video inputs and put the post-conversion 1080i video into a second one. And the captions would be bridged correctly from one of those inputs to the other. So that could help you if you're having a problem with frame rate conversion for the captions.
Regina: Someone asks, Can I use the Falcon encoder with in-house Captioners?
Bill: Well, the Falcon encoder needs to be - you need to connect to it through iCap. But if your in-house captioners are trained to use the iCap software–and that's a free download, they can come use the iCap software–then they'd be able to use that. If they're used to working with serial cable or something like that, since Falcon is a product that's remote in the cloud, you're not going to be able to send it captions using the serial cable, but you can use iCap from any location.
Regina: Mark is asking if Falcon will work just as well with SRT livestream events, because there is a shift underway for many going from RTMP to SRT.
Bill: Yeah, for sure. So Falcon actually currently supports only an RTMP input and what we recommend customers doing SRT events is to use our Alta product since SRT is actually a wrapped MPEG transport stream. So what you can do is you can use our Alta product, which processes transport streams, and if the transport stream is wrapped in SRT, for example, you can use AWS MediaConnect or a lot of other types of solutions to unwrap that from SRT to MPEG transport. And we do have on our roadmap to make the SRT support completely native and be able to terminate that wrapper within the Alta product. The Alta product is licensed on a slightly different model than Falcon, so that's something that might be of interest to kind of look at in a little bit more detail but, essentially, the Alta product would really be the product for SRT streaming at this time.
Regina: What is the maximum captioning output that Falcon can send to a single livestream? Someone is looking to provide multiple languages.
Bill: Yes, so Matt went over this in the webinar a bit, and it can kind of depend how you're embedding the captions and what your player supports. Using the HLS output, we can put up to six languages on a single stream. Using an embedded RTMP caption output, you can do up to four if that's supported by your player. Some players will have additional restrictions that they only read one language of captioning or they only read two languages of captioning. End to end you need to look at what you're able to do, but the theoretical highest number with Falcon is six.
Regina: Vicki says that Matt mentioned word error rate and would like to know what tools or services we used to measure caption or verbatim accuracy.
Bill: Yeah, so I answered a question in the text questions kind of talking about accuracy, too. And it's obviously a much more complicated topic than simply having a single percentage number and there is no methodology that produces the same percentage number across all types of content.
So it's really something that depends on the content that you're putting through, as well as kind of what your measurement methodology is. One of the two basic measurement methodologies that are considered is a word error rate, which will just be kind of how many mistakes or omissions out of 100% are in the transcript compared to a perfect transcript. Even there, there could be some matters of opinion involving how much paraphrase to tolerate or involving how to count something like a punctuation or how to count something where perhaps the speaker was really not very clear at all and how to grade that. But that is comparatively rather simple. You could do that in-house simply by creating an accurate transcript and then doing a word-for-word comparison.
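The word-for-word comparison described here can be sketched as a word-level edit distance between a reference transcript and the captions. This is a minimal illustration, assuming lowercase whitespace tokenization and equal weight for substitutions, insertions, and deletions; it is not the tool EEG uses, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from captions
            insertion = curr[j - 1] + 1  # extra word in captions
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
wer = word_error_rate("the quick brown fox", "the quick fox")
accuracy_percent = (1 - wer) * 100
```

How punctuation, paraphrase, and unclear speech are tokenized and scored is exactly the matter of opinion mentioned above, so two teams can report different numbers for the same captions.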
In a lot of countries, especially this has become very popular in Canada and in the UK, people will perform what's called an NER analysis, which is a somewhat more sophisticated way of taking the captioning and comparing, really, the meaning of what’s said compared to the meaning of the text captions with different levels of points lost depending on how much the meaning is transformed. Typically, although the rules for NER are public and open source, typically you’ll do that through an actual certified third-party auditor, especially if doing it for compliance reasons. And the standard in, for example, Canada, is to do a 98 NER score, and what we've seen is that Lexi is capable of doing that for what you might consider to be the easier tier of programs; something like a news program, you know, with a pretty consistent speaker. And that’s a point where automatic captions is mostly not reaching that type of NER for more complicated programs, for example, something that's like a live sports commentary with multiple commentators and maybe a lot of excitement and background noise and etcetera.
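For comparison with word error rate, the NER calculation can be sketched as follows, assuming the commonly published form of the model: score = (N - E - R) / N x 100, where N is the number of words in the captions, E is the sum of penalties for edition errors, and R is the sum of penalties for recognition errors. The penalty weights of 0.25, 0.5, and 1.0 for minor, standard, and serious errors are an assumption here, as are the function name `ner_score` and the sample figures; a certified auditor's scoring rules take precedence.

```python
from typing import Iterable


def ner_score(n_words: int,
              edition_penalties: Iterable[float],
              recognition_penalties: Iterable[float]) -> float:
    """NER score as a percentage: (N - E - R) / N * 100."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    edition_total = sum(edition_penalties)          # E
    recognition_total = sum(recognition_penalties)  # R
    return (n_words - edition_total - recognition_total) / n_words * 100


# Hypothetical audit of 1,000 caption words; each entry is one error's penalty
# (assumed weights: 0.25 minor, 0.5 standard, 1.0 serious).
score = ner_score(
    1000,
    edition_penalties=[0.25, 0.25, 0.5],
    recognition_penalties=[1.0, 0.5, 0.5, 0.25, 0.25],
)
meets_threshold = score >= 98.0  # the 98 NER figure mentioned for Canada
```

Unlike word error rate, the penalty for each error depends on how much it changes the meaning, which is why the assessment is typically done by a trained reviewer rather than by an automated diff.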
So it's a pretty exciting time where a lot of this is really improving all the time, so it's a moving target to simply say what percentage accuracy is Lexi, but hopefully that gives you a flavor of some of the work that we're doing.
Regina: Thank you. And Michael asks, In the cloud RTMP to HLS workflow, does the cloud transcode the single RTMP video resolution into multiple resolutions needed for different devices, or is there only a single HLS output with no video transcoding?
Bill: The second one is currently correct. EEG is producing a non-transcoded segmented HLS in the bitrate that you upload. So yeah, for a lot of scenarios you're going to actually feed that through, perhaps, and do adaptive bitrates through a separate transcoder the same way you would with RTMP. So we're not necessarily filling the entire role that a lot of times platforms that go from RTMP to HLS do. We definitely do have some interesting roadmap features on that as well, where the goal is really to make it so that Falcon can do a lot of interesting things in your streaming tool kit and kind of not be as dependent on other platforms for some of the things that you'd like to do and some of the things that you'd like to see in your players with captioning, especially.
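To illustrate the distinction in generic HLS terms, an adaptive-bitrate stream is advertised through a master playlist that lists one variant per rendition, while a non-transcoded output has only one rendition at the uploaded bitrate. The sketch below builds such a master playlist as a separate transcoder might publish it. The bitrates, resolutions, and URIs are hypothetical, and nothing here is taken from Falcon's actual output.

```python
def master_playlist(renditions: list[dict]) -> str:
    """Build an HLS master playlist with one variant entry per rendition."""
    lines = ["#EXTM3U"]
    for r in renditions:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r['bandwidth']},"
            f"RESOLUTION={r['width']}x{r['height']}"
        )
        lines.append(r["uri"])
    return "\n".join(lines) + "\n"


# Hypothetical three-step ladder produced by a separate transcoder.
ladder = [
    {"bandwidth": 5_000_000, "width": 1920, "height": 1080, "uri": "1080p/index.m3u8"},
    {"bandwidth": 2_500_000, "width": 1280, "height": 720, "uri": "720p/index.m3u8"},
    {"bandwidth": 800_000, "width": 640, "height": 360, "uri": "360p/index.m3u8"},
]
print(master_playlist(ladder))
```

A player picks among the listed variants based on available bandwidth, which is why a single-rendition stream cannot adapt to slower connections on its own.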
Regina: David asks if we have experience with captioning live sporting events.
Bill: Well yeah, absolutely. I mean, of course, EEG - our encoders are at a lot of the major sports networks and have been doing that for many years. Live sporting using automatic captioning is something that, like I said before, is a somewhat higher degree of difficulty than a lot of types of news or enterprise-style business presentations. Sports tends to have a lot of jargon, a lot of names that need to be trained. So there is an interesting degree of difficulty problem there.
We do have some customers that are using Lexi for live sports events, but that's something where it’s probably not what you're seeing on, say, ESPN or something like that, where really the best-quality results, that’s still usually in the domain of a human stenocaptioner.
Regina: TJ asks if there is an option to have an on-premises auto-translation (so Lexi) that doesn't use the cloud.
Bill: I’m going to throw this one to Matt.
Matt: Yes, so there is actually a 1RU unit called Lexi Local that we have that does this on-premise, and you can hook it into one of our encoders and have captioning done without any Internet connection required at all. So it's great for things like anywhere you need extra security and you don't like the audio going over the cloud. So yeah, Lexi Local would be the product you'd be looking for in that scenario.
Regina: Is there a best approach for inserting captions into pre-recorded videos that will then be streamed out live, specifically in ProRes formats from AJA players rather than the more standard MPEG-2?
Bill: We can do that with our CCPlay FilePro software, which wasn't covered in this webinar, but you can find information about that on our website. It's a post-production caption stitching tool that will take caption files of various formats and embed them into MPEG transport stream files or MXF files or ProRes-style files. I think with ProRes, you need to make sure you're running that in a wrapper like an MOV wrapper that will hold the captions. But FilePro will be able to embed your captions into pretty much anything that could hold captions.
Regina: Bob asks if there is a way to force the look of captions in streaming players, such as font, words per line, or number of lines. And if yes, whether that is a Lexi or Falcon feature.
Bill: It depends, because Lexi does have options for you to select the things that are kinda the classic caption parameters that are in embedded captions, namely, like the positioning of the captions on the screen, the number of rows on the screen, the number of letters per row. Players will sometimes vary in how much they respect that information. So some players really take the captions and they understand them as text only, and they format them in a way that the player kind of decides is best, which is not necessarily all bad, because the standards for embedded captioning, they originally came from traditional TV where you were working with kind of an unknown large screen size and shape. And with streaming, obviously responsiveness to different browser window sizes, everything from a smart TV to a phone, means that the same type of captions display is not going to be perfectly readable across all of your viewers’ devices all the time.
So because of that, there's often a need, kind of, for the players to reformat a little bit. Some players will display more literally what you send, either through Lexi or through a human live captioner, who also has control over that kind of thing when you're working with them, so you could ask your caption provider to do a certain number of rows onscreen or a certain style. The player may respect that in some situations, and it may wind up needing to scale the words to look a different way in other situations. Some of the players will also have local user settings where the user can independently, without any help from the caption supplier, change the size, change the color, change the font, so a lot of that behavior can be player-dependent. But there are some settings to start with, either in Lexi or by talking to your third-party captioners.
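The classic caption parameters mentioned here, the number of letters per row and the number of rows on screen, amount to a simple text-layout rule. As a minimal sketch, assuming the 32-character row width of traditional CEA-608 embedded captions and a two-row display, the layout could be modeled as follows. The function name and sample text are illustrative, and a streaming player is free to ignore this layout and reflow the same text for its own window size.

```python
import textwrap


def format_rows(text: str, chars_per_row: int = 32, max_rows: int = 2) -> list[list[str]]:
    """Wrap caption text into rows, then group the rows into successive screens."""
    rows = textwrap.wrap(text, width=chars_per_row)
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


sample = "Players will sometimes vary in how much they respect caption formatting information"
for number, screen in enumerate(format_rows(sample), start=1):
    print(f"screen {number}:")
    for row in screen:
        print(f"  {row}")
```

A player that treats captions as text only would effectively re-run a step like this with its own width, which is why the same captions can break across lines differently on a phone and on a smart TV.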
Regina: Jesse asks if Vimeo will be added as a standard streaming destination.
Matt: My understanding is that Vimeo already works with Falcon. I've had a couple of clients that have said that that's their destination platform to be used with Falcon, so my understanding is it does currently work. There may be some configuration needed to get it working, but I do believe it works.
Bill: Yeah, I'm looking for the question in the text window, but I think the question kind of referred to also the fact that some of the most common platforms like Facebook, like Twitch, we have a preset that makes it easier to switch to stream to them, and you don't need to type the whole URL and remember it yourself. So yeah, I think Vimeo would be a very good add for that. We do have a lot of customers doing Falcon on Vimeo. Sometimes the streams to Vimeo are a little bit more mysterious than I'd like them to be, but we do. It's a very popular platform and a lot of customers are having success with that, so I think that's a good point. We should add Vimeo to the standards.
Regina: How well does Lexi handle accents in English, especially those associated with small ethnic communities?
Bill: We've seen all of it over time, and it depends. I think Lexi is reasonably flexible to different accents. However, you probably want to apply a little bit of the human judgment test on that. Clearly there are speakers who many people not from their region might have considerable difficulty understanding. It's probably fair to say that, in general, automatic captioning doesn't understand people's accents better than other humans do.
So really, if other people would tend to understand the person quite well, I think you'll have good results with Lexi. If it's a risk that a lot of the audience members might struggle, then it's probably going to be a risk that the AI captioning is going to struggle, too. I mean, in a case like that, probably the best thing you could do would be really to be working with a human captioner who was familiar with that accent or dialect and was really able to understand it, hopefully better than what you might think of as the average outside-the-community member of the audience. And in that sense, the captions could be helpful for everybody.
Regina: Can Falcon be used to send multiple language captions to an external HTTPS endpoint and what is the caption format used in this protocol?
Bill: Yes, definitely. That would be the HTTP version of the captioning for Falcon. So on EEG Cloud there is a separate menu for HTTP Falcon, and that's what’s actually used for things like Zoom captioning, where it only takes an HTTP endpoint. And there's a couple of different data payload styles that Falcon will support to upload in HTTP so it's not just one format. There's a basic generic one that some customers have used with their own integrations, and that's basically going to be like an XML-formatted payload with a couple of things involving timestamp and kind of metadata about the text and then your basic UTF-8 text itself.
So if you have developers looking to get in touch regarding what's there, I definitely encourage you to do that and we can send you payload documentation.
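As a rough illustration of what receiving-side developers might expect from the generic payload described above (XML carrying a timestamp, some metadata about the text, and the UTF-8 text itself), the sketch below builds and prepares such a message. Every element name, the language field, and the endpoint URL are hypothetical placeholders, since the real schema is defined by EEG's payload documentation, and the request is constructed but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_caption_payload(text: str, language: str, when: datetime) -> bytes:
    """Serialize one caption update as UTF-8 XML (element names are hypothetical)."""
    root = ET.Element("caption")
    ET.SubElement(root, "timestamp").text = when.isoformat()
    ET.SubElement(root, "language").text = language
    ET.SubElement(root, "text").text = text
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_request(endpoint_url: str, payload: bytes) -> urllib.request.Request:
    """Prepare an HTTP POST carrying the payload; the caller decides when to send it."""
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


payload = build_caption_payload(
    "Bienvenue à la conférence.", "fr", datetime.now(timezone.utc)
)
request = build_request("https://captions.example.com/ingest", payload)  # placeholder URL
print(payload.decode("utf-8"))
```

Tagging each message with a language, as assumed here, is one way an endpoint could tell multiple caption languages apart; separate endpoints per language would be another.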
Regina: Jennifer asks if Falcon can be used in Zoom, and I know this is something that we often cover, but if you could just reiterate the process of what we do to make sure that our webinars are live captioning.
Bill: Right, so props to Wes Long for being our AV guy on this today and during all our webinars. What we do is we use HTTP-style Falcon to send the captions to Zoom, and we also need to get the audio to Lexi. If you were using, say, a human captioner who was going to actually appear and listen in your meeting, you wouldn't necessarily need that step. But if you want to get the audio through iCap rather than through any other means, then we have a program called iCap Webcast that's a free program that works with Falcon.
And what you can do with iCap Webcast is you can take an audio source from your computer, like from your desktop or from an external connection into your computer, and you can stream that out as iCap audio, which then means you can use kind of external sources of audio out in the universe as though they were the audio going directly through, for example, the RTMP stream in Falcon or on the SDI video caption encoder.
So it kind of provides you with a connection to generate your own iCap-compatible audio source from kind of generic third-party software or devices, so that's how we get the audio in. So the audio is basically the mixed audio of the participants that you're hearing. And at that point, the captions are generated and sent through HTTP Falcon.
Regina: Hugh asks if latency values are decreased for live captioning with the integration of AI text generation.
Bill: That can depend. With Lexi, something like a 3- to 4-second delay is typical, which is also true broadly of stenocaptioning-based approaches. With other approaches, like voice writing, which is kind of human-assisted respeaking and automatic captioning mixed together, delays can sometimes be longer. Sometimes when caption delays are long it's also because there's too much delay in the audio path. For example, you wouldn't want to be captioning a sports broadcast on TV where you were receiving the audio from your satellite cable feed, because that's going to be several seconds delayed relative to where you're sending the captions back to, and that's going to get added into the caption delay.
So the engineering of the audio on that is a little important. For example, you also wouldn't want to take a live stream that was maybe 15 seconds behind real-time. You wouldn't want to have the captioner be listening to that 15-second-behind real-time feed through, say, YouTube or something, and then sending captions back. You need to make sure the captioner has a low-latency source of audio that’s going to match where the delay is at the point where the captions are injected. So there can be a lot of factors in delay. Broadly, I do think the Lexi solution is one of the lower sources of caption delay that you’ll see, but there can be a lot of factors.
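The arithmetic behind this answer is simply additive: any delay in the audio path feeding the captioner is stacked on top of the captioning delay itself. A minimal sketch, using the 15-second stream and a 3.5-second midpoint of the 3- to 4-second figure above as assumed inputs:

```python
def caption_lag_seconds(audio_path_delay: float, captioning_delay: float) -> float:
    """Total lag of the captions relative to the program at the injection point."""
    if audio_path_delay < 0 or captioning_delay < 0:
        raise ValueError("delays must be non-negative")
    return audio_path_delay + captioning_delay


# Hypothetical scenarios built from the figures mentioned in the answer above.
direct_feed = caption_lag_seconds(audio_path_delay=0.0, captioning_delay=3.5)
delayed_stream = caption_lag_seconds(audio_path_delay=15.0, captioning_delay=3.5)
print(f"low-latency audio source: {direct_feed:.1f} s behind")
print(f"captioner listening to a 15 s delayed stream: {delayed_stream:.1f} s behind")
```

In the second scenario most of the lag comes from the audio source rather than from the captioning method, which is why the choice of audio feed matters as much as the choice of captioner.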
Regina: So this concludes our webinar. Thank you all for your questions, and a huge thank you to Matt, Bill, and Wes for participating in today's webinar. If we didn't get to your questions, we will be in touch with you following the event. But if you have any questions about EEG, about any of the topics that we covered today, or anything in general, you are welcome to reach out to us at firstname.lastname@example.org.
Matt: I appreciate all the participation. It’s great. I’m glad to see that we were able to connect with everybody.
Matt: Bye everyone, thanks.