Performance evaluation of WebRTC-based online consultation platform

Information technologies give patients the opportunity to communicate with medical professionals remotely. Telemedicine uses these technologies to provide advanced healthcare and medical services. We present a medical online consultation application based on Web Real-Time Communications (WebRTC) technology enabling chat, audio, and video calls. Communication architecture and protocols of the application are explained in detail. Additionally, the user interface of the application is shown via performed calls. The application is tested and evaluated on different network connections (3G, 4G, local, and DSL) and different browsers and mobile operating systems (Android, Chrome, Firefox, Internet Explorer, iOS, Opera, Safari). During calls, communication quality parameters such as round-trip time (RTT) and packet loss, obtained via the WebRTC application programming interface, are analyzed. 3G, 4G, and local connections show low packet losses (< 1%). Packet losses are high (> 1%) in Android, Chrome, iOS, Opera, and Safari for the DSL connection, but RTT values are low (< 100 ms) in all conditions except iOS. In the presented application, RTT and packet loss remain lower than 100 ms and 1%, respectively, in various scenarios, indicating good communication quality. RTT and packet loss are related to the total time and hang time parameters, which describe the time needed to establish and to end a call. It is shown that the communication quality of the application can be estimated simply by analyzing the total time parameter. For the first time, this makes it possible to predict the communication quality of WebRTC-based applications without continuously monitoring RTT and packet loss.


Introduction
Healthcare services have become more innovative and patient-centric with the help of information technology. Communication technologies have made the exchange of information between patients and doctors easier, which improves healthcare services and technologies [1][2][3][4][5]. Patients can readily reach doctors and healthcare service providers through communication tools [6]. Moreover, it is possible to share medical data with and among healthcare service providers [7]. As a result of these services, a new medical field called telemedicine has emerged, whereby healthcare providers can remotely diagnose, monitor, and treat diseases via telecommunication tools [8]. Telemedicine aims to provide remote clinical support, overcome geographical barriers, and improve health outcomes by using various types of information and communication technologies [9]. Telemedicine has many uses in the medical field, and its branches are commonly referred to as teleradiology [10], telepathology [11], tele-emergency [12], teletrauma [13], telecardiology [14], teledermatology [15], teleophthalmology [16], telepsychology [17], and telesurgery [18]. The use of information technology in medicine is growing day by day, and telemedicine has become a necessity for today's healthcare services [19].
There are many tools available in the telemedicine field, and online consultation is one of its key elements: it helps to reduce nursing costs, overcome shortages in the healthcare workforce, maintain better home healthcare, and enhance public health, health education, and patient-doctor and interprofessional relationships [20][21][22]. Web Real-Time Communication (WebRTC), an open-source project that provides real-time communications to web browsers, is one of the important tools in this regard [23].
WebRTC offers secure peer-to-peer media and direct file sharing between browsers without the need of mediating servers or third-party software [24]. The WebRTC application programming interface (API) is used for developing communication applications with data, audio, and video channels. WebRTC API has three main elements: (i) PeerConnection, (ii) DataChannel, and (iii) MediaStream [25]. PeerConnection establishes direct communication between peers. DataChannel is a data transport service between peers via bidirectional peer-to-peer connection. MediaStream creates audio and video streams and manages their content.
WebRTC has been used for online medical consultation, enabling treatment and rehabilitation [26], and also for diagnosis by transmitting radiology images. For instance, ultrasound images of organs were sent over the WebRTC framework for remote diagnostics [27]. In some cases, WebRTC has been utilized for monitoring and examination of elderly people and physically or mentally challenged patients [22][28][29][30].
A WebRTC-based architecture was designed for remote and continuous monitoring of seniors, remote medical examination, and emergency intervention [28]. In this architecture, wearable devices were planned to supply necessary information remotely to healthcare personnel via the WebRTC DataChannel while online consultation took place. A WebRTC-based application was developed for coordination of paramedics with an incident command center in disaster scenarios [29]. This application works via Google Glass, allowing paramedics hands-free communication. A video conferencing system based on WebRTC was designed for a tele-home monitoring project conducted in different states of Austria [30]. EasyRTC, an online toolkit for implementing WebRTC applications, was used for conferencing. WebRTC was also used for physical therapy and exercise over a telepresence application [31]: a telepresence exercise platform based on WebRTC technology was developed for elderly women with a high risk of falling. Moreover, real-time communication technology could be integrated with virtual and augmented reality for healthcare services [32].
Here, we present an online consultation application based on native WebRTC technology and explain its communication protocols in detail. The coverage of the application has been tested for different networks, browsers, and mobile systems by evaluating communication quality parameters. In this way, WebRTC performance in different usage scenarios has been revealed for the first time. Moreover, the times needed to establish and to end a call are studied to reveal their effects on communication quality.

WebRTC communication protocols
The WebRTC communication architecture used in our application is shown in Figure 1. In this architecture, both peers need to send necessary protocol information through a socket.io server to establish a connection.
Interactive Connectivity Establishment (ICE) candidates are initiated and identified through the protocol to locate peers [33]. For this purpose, a Session Traversal Utilities for Network Address Translator (STUN) server is first used to identify peer addresses. If the STUN server succeeds in identifying the addresses, a direct media connection is established between the peers. In some cases, the STUN server cannot provide a direct connection to the peer. A Traversal Using Relays around Network Address Translator (TURN) server then provides a public address for the peers and relays media between them.
Figure 2 shows the communication protocol between peers for a call procedure composed of initiation, handshake, ICE negotiation, and hang-up stages. In the initiation stage, peers first log in to the application and are assigned to different rooms by the socket.io server. These rooms are used to send the communication status of peers. Users are then in an Idle step, where they can receive or initiate calls. To place a call, a peer (the caller) starts a call and registers with the room of the second peer (the callee). If the callee's room is full (i.e. more than two peers are connected) or empty, the calling process is terminated. If not, the callee receives the incoming call. If the callee accepts the call, the WebRTC communication procedure starts.
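The room check in the initiation stage can be sketched in a few lines of Python. This is only an illustrative model, not the application's socket.io code: the function name `can_place_call` and the constant `MAX_PEERS_PER_ROOM` are hypothetical, and the sketch assumes that the occupancy is counted before the caller registers, so that an occupancy of one means the callee is logged in and idle.

```python
# Hypothetical model of the call admission rule described in the text.
# Assumption: a room holds at most two peers (caller and callee), so the
# room is "full" once adding the caller would connect more than two peers.
MAX_PEERS_PER_ROOM = 2


def can_place_call(peers_in_callee_room: int) -> bool:
    """Return True if the callee should receive the incoming call.

    peers_in_callee_room is the number of peers in the callee's room
    before the caller registers with it (an assumption of this sketch).
    """
    if peers_in_callee_room < 0:
        raise ValueError("occupancy cannot be negative")
    if peers_in_callee_room == 0:
        # Empty room: the callee is not logged in, so the call is terminated.
        return False
    if peers_in_callee_room + 1 > MAX_PEERS_PER_ROOM:
        # Full room: the callee is already in a call, so the call is terminated.
        return False
    return True
```

Under these assumptions, `can_place_call(1)` is the only case in which the callee is rung; an occupancy of zero or of two or more terminates the calling process.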
The WebRTC library is used to send media to and receive media from peers. First, peers start their media streams (video and/or audio). When the streams are ready, the callee creates a peer connection, adds its local stream, creates an offer, and sends this offer to the caller. For chat communications, the application follows the same procedure but marks the users' media streams as ready without starting the media, and it opens the data channel for transferring text messages. When the caller receives the offer, it also creates a peer connection, sets the remote description, adds its local stream, creates an answer, and sends this answer to the callee. When the answer is received, the callee sets the remote description. While the peer connection is being set, ICE candidates are initiated. If a local ICE candidate is created, the protocol checks the created offer and adds the remote ICE candidate. If one of these checked items is available, the local ICE candidate is sent to the remote user, who adds it as a remote ICE candidate. When the remote and local streams are ready, the ICE connection state is checked. If the ICE state is connected or completed, the call is established and the communication screens for chat, audio, or video calls are shown to the peers. When the ICE connection is in the checking or new state, the protocol waits for a while and checks the ICE connection state again. If the ICE connection is in any other state (disconnected, failed, or closed), the connection is restarted by closing the peer connection.
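The ICE connection state handling at the end of the handshake can be summarized as a small decision function. The sketch below is written in Python for illustration only; the application itself runs in the browser. The state names are the standard ICE connection states listed in the text, while the action labels and the function names `next_action` and `resolve` are hypothetical.

```python
# ICE connection states grouped as described in the text.
ESTABLISHED_STATES = {"connected", "completed"}
PENDING_STATES = {"new", "checking"}
RESTART_STATES = {"disconnected", "failed", "closed"}


def next_action(ice_state: str) -> str:
    """Map one observed ICE connection state to the protocol's next step."""
    if ice_state in ESTABLISHED_STATES:
        return "establish_call"        # show the chat/audio/video screen
    if ice_state in PENDING_STATES:
        return "wait_and_recheck"      # wait for a while, then check again
    if ice_state in RESTART_STATES:
        return "restart_connection"    # close the peer connection and restart
    raise ValueError(f"unknown ICE connection state: {ice_state!r}")


def resolve(observed_states: list[str]) -> str:
    """Walk a sequence of observed states until a decisive action is reached.

    Returns "wait_and_recheck" if every observed state was still pending.
    """
    for state in observed_states:
        action = next_action(state)
        if action != "wait_and_recheck":
            return action
    return "wait_and_recheck"
```

For example, `resolve(["new", "checking", "connected"])` yields `"establish_call"`, whereas a sequence ending in `"failed"` yields `"restart_connection"`. The text does not state how long the protocol waits between checks, so no timing is modeled here.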
If the callee or caller hangs up on the call, the peer connection and data channel will be closed and media streams will be terminated. Furthermore, the caller will exit the callee's room and register with its own room. Hence, the hang-up procedure (Hang Up Stage) will be completed and peers will be ready to receive or initiate a new call.
Figure 2. Communication protocol between peers. The state machine diagram is given for a call procedure composed of initiation, handshake, ICE negotiation, and hang-up stages.

Development of WebRTC application
The user interface of our application is given in Figure 3. After users log in to the application, they can search for other users to make chat, audio, and video calls (Figure 4). We used HTML, CSS, PHP, and JavaScript to develop our online consultation application. The WebRTC API adapter.js (version v0.14.0) was used to implement the WebRTC communication protocols. After developing the web-based application, the code was built for mobile systems (Android and iOS) by using the Apache Cordova development framework. The Cordova iOSRTC plugin enables the WebRTC API on iOS. In this way, the same code was utilized for both web and mobile applications. The web application can run on browsers that support WebRTC, such as Chrome, Firefox, or Opera. Moreover, the Temasys Browser Plugin was used to provide WebRTC support for other browsers, such as Safari and Internet Explorer.

Performance analysis
WebRTC-based real-time video, audio, and chat communications were tested for different connection types. The performance data of the communication were obtained via the GetStats API, which gives access to WebRTC data on Chrome browsers. Using this API, we analyzed (i) round-trip time (RTT), which is the elapsed time in milliseconds between the browser sending a request and receiving the corresponding response [34]; (ii) packets received, which is the total number of packets received for this connection over the network; (iii) packets lost, which is the total number of packets lost for this connection [35]; and (iv) frame rate (frames per second, fps), only for video communication. We also analyzed packet loss, which is the percentage of packets lost with respect to total received packets. Moreover, total connection time and hang time were measured. Total connection time is the time elapsed between the start of the users' media and the call established state (Figure 2). Hang time is the time needed to complete the hang-up stage (Figure 2).
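As a minimal sketch of the packet loss metric defined above, the following Python function computes the percentage of lost packets relative to received packets from the two counters. The function name is hypothetical, and the counter values in the usage note are made up for illustration; they are not measurements from this study.

```python
def packet_loss_percent(packets_lost: int, packets_received: int) -> float:
    """Packet loss as a percentage of the total received packets.

    This follows the definition used in the text (lost / received * 100).
    Note that other tools sometimes divide by lost + received instead.
    """
    if packets_lost < 0 or packets_received < 0:
        raise ValueError("packet counters cannot be negative")
    if packets_received == 0:
        raise ValueError("packet loss is undefined when nothing was received")
    return 100.0 * packets_lost / packets_received
```

With hypothetical counters of 5 lost and 1000 received packets, the function returns 0.5, i.e. a packet loss of 0.5%.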
We tested the performance of the different communication types (chat, audio, video) with six different calls. Each call was 5 min long, and during the call performance data were collected at intervals of 5 s. For chat calls, users sent a 20-character message every second. For video calls, a 640 × 480 video at 30 fps was set to be streamed. After 5 min, the call was ended automatically. The Chrome browser was used to analyze the performance of our application on different connections. For the local connection type, the caller and callee were in the same local network. For the other connection types, the caller's connection was varied while the callee's connection remained DSL. For testing the performance of different browsers and mobile operating systems, the caller's browser or operating system was changed accordingly while the callee used the Chrome browser. The DSL connection was used for testing different browsers and mobile operating systems. The callee's data were used for all tests, and the test data were compared using one-way ANOVA (analysis of variance) with the Tukey test. All performance data are available in the supplementary tables and figures.

Results and discussion
The performance data of the application were analyzed for different types of connections, which are 3G, 4G, DSL, and local (Figure 5). 4G technology has recently gained momentum and has begun to replace 3G technology. In 2018, 29% and 45% of mobile subscribers used 3G and 4G networks, respectively. Although the use of 4G technology is increasing and even 5G technology is being introduced, it is predicted that by 2024, 3G technology will still account for 17% of mobile subscribers globally. Meanwhile, 4G is expected to reach 56% of all mobile subscribers. Each tested connection type differs from the others in network architecture features such as data transfer rate, security, access, and transmission terminology [36]. As shown in Figure 5a, 3G resulted in a longer total connection time, which was significantly different from that of the 4G and local networks for chat and video calls. However, for audio calls, total connection time was not significantly different between connection types. Total connection time reached 1.38 ± 0.24 s for video calls in 3G. On the other hand, this connection time was reduced to 0.97 ± 0.12 s for the local connection. Because video calls carry a large video media stream, these calls showed longer total connection times than chat and audio calls. Although 3G showed the longest hang times for each call type (Figure 5b), hang times remained under 0.53 s for all call types. Hang time values of 3G were significantly different from those of the other networks for chat and audio calls, and they were also significantly different from those of the local and DSL networks for video calls. DSL and local networks had smaller hang times than 3G and 4G networks. RTT values differed considerably across networks for all call types (Figure 5c). The local network showed the smallest RTT values for audio call packets, audio packets of video calls, and video packets of video calls. RTT had its maximum values for 3G connections, reaching 77.98 ± 38.00 ms for video packets of video calls. This value is well below 100 ms, which is a measure for optimal media quality.
Figure 6. Results for different browsers and mobile operating systems: (a) total connection time, (b) hang time, (c) RTT, and (d) packet losses shown for different call types in Android, Chrome, Firefox, Internet Explorer, iOS, Opera, and Safari browsers and mobile operating systems. *, **, and *** are used to indicate P < 0.05, P < 0.01, and P < 0.001, respectively.
There was no significant difference in packet loss for audio packets of video calls (Figure 5d). However, packet loss was higher in the DSL connection than in the other connection types; it was significantly different from the 3G network in audio calls and from all other networks for video packets of video calls. Packet loss reached its maximum value of 4.21 ± 3.88% for video packets of video calls in the DSL connection. On the other hand, the 4G network's packet loss remained under 0.01% for audio and video packets of video calls. Among the connection types, 3G, which has the lowest Internet connection speed, showed the worst performance in total time, hang time, and RTT. In contrast, packet loss for video packets was 31-fold higher in DSL, which has the second lowest Internet connection speed, than in the 3G connection. Packet loss in the other connection types remained below 1%, which is good for streaming audio or video. Packet losses in 3G, 4G, and local networks were not statistically different. Moreover, average frame rates of video communication were in the interval of 29.4-30 fps for the different networks, and no statistical difference between networks was found (Figure S1). Based on these results, the presented application showed good communication quality (RTT < 100 ms and packet loss < 1%) in different network scenarios. However, video packet loss in DSL was significantly higher (> 1%), which indicates a problem in communication quality for video calls in the DSL network.
Performance analysis of the application was also performed on different browsers and mobile operating systems (Figure 6). The iOS mobile operating system had the longest total connection times, with 1.13 ± 0.5 s, 1.64 ± 0.1 s, and 2.86 ± 1.32 s for chat, audio, and video calls, respectively (Figure 6a).
For chat calls, total time was below 1.5 s for all systems, whereas total times were below 1 s only for Chrome, Firefox, Opera, and Safari. Similar trends in total time values were observed for audio calls, except in iOS and Safari, where total time reached 1.64 s and 1.2 s, respectively. For video calls, total times were below 2.11 s for all systems except iOS, with a minimum of 1.19 ± 0.13 s for the Chrome browser. As shown in Figure 6b, the longest hang time was obtained for Internet Explorer during video calls and was 0.97 ± 0.13 s, whereas the hang time for the other systems stayed between 0.29 and 0.47 s. The RTT value of Internet Explorer was higher than that of the other systems for audio calls (Figure 6c). This RTT value was 26.97 ± 12.17 ms, which was not statistically different from the RTT value of iOS. RTT values of iOS were higher than those of the other systems for video calls, at 466.03 ± 353.13 ms and 436.06 ± 363.02 ms for audio packets and video packets, respectively (Figure 6c). Accordingly, we encountered several video freezing problems during iOS tests, which might have been caused by the Cordova iOSRTC plugin. On the other hand, RTT values remained under 53 ms for the other systems, assuring good streaming quality. Packet losses remained under 1% for all audio calls and audio packets of all video calls. There was no significant difference between these calls. Video packet loss was under 1% only for Firefox and Internet Explorer (Figure 6d). However, packet loss reached 7.20 ± 1.46% for Opera, which was significantly different from Chrome, Firefox, and Internet Explorer for video packets of video calls. The remaining systems had less than 5% packet loss. Furthermore, average frame rates of browsers and mobile operating systems in video communications were in the interval of 29.4-30 fps.
Safari had the lowest frame rate, which was also statistically different from that of the other systems except iOS (Figure S2). These results show that audio streaming had good quality (RTT < 100 ms and packet loss < 1%) in all tested systems in the DSL network. High RTT (> 100 ms) and packet loss (> 1%) in iOS, and high packet loss (> 1%) in Android, Chrome, Opera, and Safari, can affect video streaming quality in DSL networks [37,38]. However, packet losses could be minimized by using different networks, such as 4G (Figure 5d), in order to improve communication quality in browsers and mobile operating systems.
Tests revealed that on average 52.03 MB of data were received during 5-min video calls. During the calls, 640 × 480 video at an average of 29.89 fps was streamed, which consumed 828-fold and 36-fold more data than chat and audio calls, respectively. Although this large data consumption could be costly for mobile users, the increased popularity of unlimited data plans may alleviate this problem.
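The data volumes above can be turned into rough per-call figures with simple arithmetic. The Python sketch below is a back-of-the-envelope calculation, not a measurement: it assumes that 1 MB equals 10^6 bytes, that the call lasts exactly 300 s, and that the 828-fold and 36-fold ratios apply to the same received-data quantity. All names are hypothetical.

```python
VIDEO_CALL_MB = 52.03        # average data received in a 5-min video call (from the text)
CALL_DURATION_S = 5 * 60     # call length in seconds
VIDEO_TO_CHAT_RATIO = 828    # video consumed 828-fold more data than chat
VIDEO_TO_AUDIO_RATIO = 36    # video consumed 36-fold more data than audio


def average_bitrate_mbit_s(megabytes: float, seconds: float) -> float:
    """Average receive bitrate in Mbit/s, assuming 1 MB = 10**6 bytes."""
    return megabytes * 8 / seconds


# Implied average volumes of chat and audio calls, derived from the ratios.
chat_call_mb = VIDEO_CALL_MB / VIDEO_TO_CHAT_RATIO
audio_call_mb = VIDEO_CALL_MB / VIDEO_TO_AUDIO_RATIO
video_bitrate = average_bitrate_mbit_s(VIDEO_CALL_MB, CALL_DURATION_S)

if __name__ == "__main__":
    print(f"video: {video_bitrate:.2f} Mbit/s on average")
    print(f"audio call: about {audio_call_mb:.2f} MB")
    print(f"chat call: about {chat_call_mb * 1000:.0f} kB")
```

Under these assumptions, a video call averages roughly 1.4 Mbit/s in the receive direction, an audio call amounts to roughly 1.4 MB, and a chat call to roughly 63 kB.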
Communication quality can be analyzed through RTT and packet losses [38]. We defined high, medium, and low communication quality as follows: (i) high: RTT is lower than 50 ms and packet loss is lower than 0.5%; (ii) medium: either RTT is higher than 50 ms and packet loss is lower than 0.5%, or RTT is lower than 50 ms and packet loss is higher than 0.5%; (iii) low: RTT is higher than 50 ms and packet loss is higher than 0.5%. By this definition, 29% and 63.5% of total calls are of high and medium quality, respectively, for video communication, whereas in audio communication, 83.5% and 15% of total calls are of high and medium quality, respectively. Total time and hang time values were examined with respect to video communication quality. When the total time was less than 1 s, high quality communication was reached for all calls (Table 1). The majority of calls (98%) had at least medium communication quality when the total time was between 1 s and 2 s. In this 1-2 s total time interval, 26%, 72%, and 2% of the calls were of high, medium, and low quality, respectively. Beyond this time interval, the percentage of low quality calls increased markedly (33.5%) and only 16.5% of calls were of high quality. For hang time values lower than 0.3 s, 25% and 75% of calls were of high and medium quality, respectively (Table 2). The application was tested with 84 different nonautomated and unsupervised video calls. In these calls, the mean values of total connection time and hang time were 6.03 s and 0.55 s, respectively (Figure 7).
In these tests, 6% of the calls had a total time of less than 1 s, indicating high communication quality (Table 1), while 47.6% of the calls had a total time of less than 2 s. Hence, nearly half of the video calls were expected to have a high probability of reaching at least medium communication quality based on the results in Table 1, which indicates that our application has sufficient quality for online consultation.
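The quality classes and their relation to total time can be expressed as a short Python sketch. The thresholds (50 ms and 0.5%) and the percentages are taken from the text; the function names are hypothetical. Two points are assumptions of this sketch: values exactly equal to a threshold are treated as exceeding it, and the 50% share of medium quality calls beyond 2 s is derived here as the remainder of the stated 33.5% low and 16.5% high shares.

```python
RTT_THRESHOLD_MS = 50.0
LOSS_THRESHOLD_PERCENT = 0.5


def classify_quality(rtt_ms: float, loss_percent: float) -> str:
    """Classify a call as 'high', 'medium', or 'low' quality.

    High: both RTT and packet loss are below their thresholds.
    Medium: exactly one of them is below its threshold.
    Low: neither is below its threshold.
    """
    rtt_ok = rtt_ms < RTT_THRESHOLD_MS
    loss_ok = loss_percent < LOSS_THRESHOLD_PERCENT
    if rtt_ok and loss_ok:
        return "high"
    if rtt_ok or loss_ok:
        return "medium"
    return "low"


def quality_shares_for_total_time(total_time_s: float) -> dict[str, float]:
    """Reported share (%) of video calls per quality class for a total time."""
    if total_time_s < 0:
        raise ValueError("total time cannot be negative")
    if total_time_s < 1.0:
        return {"high": 100.0, "medium": 0.0, "low": 0.0}
    if total_time_s <= 2.0:
        return {"high": 26.0, "medium": 72.0, "low": 2.0}
    # The medium share is derived as the remainder (assumption of this sketch).
    return {"high": 16.5, "medium": 50.0, "low": 33.5}
```

For instance, a call with a hypothetical RTT of 30 ms and 0.2% packet loss is classified as high quality, while a total time of 1.5 s falls in the interval where 98% of the calls were of at least medium quality. This is the sense in which the total time alone can serve as a coarse predictor of quality.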

Conclusion
A WebRTC-based online consultation application was presented, and it was tested in different networks, browsers, and mobile operating systems for the first time to reveal its performance in different usage scenarios. Audio calls showed good communication quality in all tests. On the other hand, video calls with Android, Chrome, iOS, Opera, and Safari in the DSL network showed degraded communication parameters, such as RTT and packet loss. This indicates communication problems, which degrade communication quality. For instance, video-freezing problems can be observed in iOS, which had high RTT (> 100 ms) and packet loss (> 1%) values. Moreover, total time values showed a good relation with communication quality, which can eliminate the need for continuous monitoring of RTT and packet loss parameters. For telemedicine services, good streaming conditions are necessary for uninterrupted services. Test results revealed good communication conditions across connection types and browsers. Hence, the presented application can be utilized in various usage scenarios in different networks and browsers. 5G network technology shows significant performance gains in terms of connection reliability, spectral efficiency, system capacity, and transmission range [39]. Once 5G networks are deployed, the application can be examined on them to reveal the effect of 5G on communication quality. As future work, the proposed application can be tested with patients and healthcare providers to reveal its usability for online medical consultation. Moreover, ultrahigh resolution images are needed for dermatological diagnosis. By improving video resolution and network quality, WebRTC applications may stream high quality videos that can lead to improved remote diagnosis. It is evident that WebRTC can open new paths in the field of telemedicine, such as developing disease-specific medical applications with secure teleconsultation capability.