The WebRTC Video Capture, Encoding, and Sending Process

Dianthe · Updated: 2024-11-11

Contents

1. Timestamp definitions
   1.1 NTP time
   1.2 Local time
2. Camera capture, timestamp assignment, and data delivery
   2.1 Delivery to the encoder
3. Frame flow from capture to encoding

1. Timestamp definitions

First, let us list how the code defines the various time values; this makes the code that follows easier to understand.

1.1 NTP time

NtpTime RealTimeClock::CurrentNtpTime()           // time elapsed from 1900-01-01 00:00:00 to now
int64_t RealTimeClock::CurrentNtpInMilliseconds() // milliseconds elapsed from 1900-01-01 00:00:00 to now, ms
int64_t rtc::TimeUTCMicros()                      // microseconds elapsed from 1970-01-01 00:00:00 to now, us
int64_t rtc::TimeUTCMillis()                      // milliseconds elapsed from 1970-01-01 00:00:00 to now, ms
int64_t NtpOffsetMsCalledOnce()                   // offset between NTP time and local machine time, ms
int64_t NtpOffsetMs()                             // same as NtpOffsetMsCalledOnce()
NtpTime TimeMicrosToNtp(int64_t time_us)          // convert local machine time to NTP time
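
To make the relationship between these clocks concrete, here is a minimal sketch (illustrative only, not taken from the WebRTC sources) that converts a Unix wall-clock reading into an NTP-style millisecond count using the fixed 2,208,988,800-second gap between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01):

#include <chrono>
#include <cstdint>

// Seconds between 1900-01-01 (NTP epoch) and 1970-01-01 (Unix epoch).
constexpr int64_t kNtpUnixOffsetSeconds = 2208988800;

// Illustrative helper: Unix wall-clock milliseconds -> NTP-style milliseconds.
int64_t UnixMsToNtpMs(int64_t unix_ms) {
  return unix_ms + kNtpUnixOffsetSeconds * 1000;
}

int64_t CurrentNtpMsSketch() {
  using namespace std::chrono;
  int64_t unix_ms =
      duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
  return UnixMsToNtpMs(unix_ms);  // roughly what CurrentNtpInMilliseconds() returns
}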
1.2 Local time

This clock starts counting at system boot and is not affected by the user changing the system time (i.e. it is monotonic).

int64_t rtc::TimeMillis()                    // milliseconds, ms
int64_t rtc::TimeMicros()                    // microseconds, us
int64_t rtc::TimeNanos()                     // nanoseconds, ns
int64_t RealTimeClock::TimeInMilliseconds()  // milliseconds, ms
int64_t RealTimeClock::TimeInMicroseconds()  // microseconds, us
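
On Linux this kind of boot-relative reading is typically obtained from clock_gettime(CLOCK_MONOTONIC). The snippet below is a minimal illustration of that idea, not the actual rtc::TimeNanos() implementation:

#include <ctime>
#include <cstdint>

// Minimal illustration of a monotonic nanosecond clock (POSIX).
int64_t MonotonicNanosSketch() {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);  // unaffected by wall-clock changes
  return static_cast<int64_t>(ts.tv_sec) * 1000000000 + ts.tv_nsec;
}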
2. Camera capture, timestamp assignment, and data delivery

VideoCaptureImpl is the video-capture implementation class. Each platform implements a subclass of it containing the platform-specific details, and every frame captured by a subclass is handed in through VideoCaptureImpl::IncomingFrame. On Android the concrete subclass is VideoCaptureAndroid; on Linux it is VideoCaptureModuleV4L2.

Taking the Linux platform as the example:

After VideoCaptureModuleV4L2 has captured data, it hands the frame back through the following interface:

int32_t VideoCaptureImpl::IncomingFrame(
    uint8_t* videoFrame,
    int32_t videoFrameLength,
    const VideoCaptureCapability& frameInfo,
    int64_t captureTime/*=0*/) // must be specified in the NTP time format in milliseconds.

If captureTime is supplied, it must be an NTP time in milliseconds. VideoCaptureModuleV4L2 does not pass this argument, so the default value 0 is used.
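
For illustration, a platform capture loop would call this roughly as follows; the buffer, capability values, and helper name here are hypothetical and only show how the arguments map onto the signature above:

// Hypothetical capture-loop callback (illustrative, not the real V4L2 code).
void OnDriverDeliveredBuffer(VideoCaptureImpl* capture,
                             uint8_t* buffer, int32_t buffer_len) {
  VideoCaptureCapability capability;  // filled in when the device was opened
  capability.width = 640;
  capability.height = 480;
  capability.videoType = VideoType::kYUY2;
  // captureTime is omitted, so IncomingFrame stamps the frame itself with
  // rtc::TimeMillis(), as described below.
  capture->IncomingFrame(buffer, buffer_len, capability /*, captureTime = 0 */);
}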

IncomingFrame then sets the frame's timestamp timestamp_us_ and delivers the frame to the registered receiver:

VideoCaptureImpl::IncomingFrame(captureTime = 0) // under VideoCaptureModuleV4L2, captureTime keeps its default value 0
{
  captureFrame.set_timestamp_ms(rtc::TimeMillis()); // stamp the frame with local time; timestamp_rtp_ and ntp_time_ms_ are still unset (0)
  VideoCaptureImpl::DeliverCapturedFrame(captureFrame) // deliver the captured frame
  {
    _dataCallBack->OnFrame(captureFrame); // hand the frame out through the callback
  }
}

2.1 Delivery to the encoder

The frame reaches the encoder through:

void VideoStreamEncoder::OnFrame(const VideoFrame& video_frame)

Here the frame's NTP time ntp_time_ms_ is set:

// Capture time may come from clock with an offset and drift from clock_.
int64_t capture_ntp_time_ms;
if (video_frame.ntp_time_ms() > 0) {             // ntp_time_ms is 0 here, so this branch is not taken
  capture_ntp_time_ms = video_frame.ntp_time_ms();
} else if (video_frame.render_time_ms() != 0) {  // render_time_ms is derived from timestamp_us_ (local time), already set at capture
  capture_ntp_time_ms = video_frame.render_time_ms() + delta_ntp_internal_ms_;
} else {
  capture_ntp_time_ms = current_time_ms + delta_ntp_internal_ms_;
}
incoming_frame.set_ntp_time_ms(capture_ntp_time_ms);

delta_ntp_internal_ms_ is initialized in the class constructor as the difference between NTP time and local time:

delta_ntp_internal_ms_(clock_->CurrentNtpInMilliseconds() - clock_->TimeInMilliseconds())
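
A small worked example (the numbers are made up) shows why adding this delta turns a local render_time_ms into an NTP capture time:

// Illustrative numbers only.
// At construction:  CurrentNtpInMilliseconds() = 3'900'000'050'000
//                   TimeInMilliseconds()       =         1'050'000
// delta_ntp_internal_ms_ = 3'900'000'050'000 - 1'050'000 = 3'899'999'000'000
//
// Later, a frame captured at local time 1'083'000 ms:
//   render_time_ms()    = 1'083'000
//   capture_ntp_time_ms = 1'083'000 + 3'899'999'000'000 = 3'900'000'083'000
// i.e. the same instant expressed on the NTP timeline.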

Then the frame's RTP timestamp timestamp_rtp_ is set:

// Convert NTP time, in ms, to RTP timestamp.
const int kMsToRtpTimestamp = 90;
incoming_frame.set_timestamp(
    kMsToRtpTimestamp * static_cast<uint32_t>(incoming_frame.ntp_time_ms()));

At this point the frame's render timestamp (timestamp_us_), capture NTP time (ntp_time_ms_) and RTP timestamp (timestamp_rtp_) are all populated. They all describe the frame's time, just in different representations.
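
As a quick sanity check on the 90 kHz RTP clock (illustrative arithmetic only): the absolute RTP value wraps around 2^32, so only differences are meaningful, and two frames captured 33 ms apart always differ by 33 * 90 = 2970 RTP ticks.

// 90 kHz RTP clock: 1 ms corresponds to 90 ticks.
// Two frames 33 ms apart (about 30 fps):
//   rtp_b - rtp_a = 33 * 90 = 2970 ticks (arithmetic is modulo 2^32)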

Ignoring the case where frames are dropped because the encoder is congested, the frame is passed to MaybeEncodeVideoFrame(video_frame) for encoding.

void VideoStreamEncoder::MaybeEncodeVideoFrame(const VideoFrame& video_frame,
                                               int64_t time_when_posted_us) {
  // skip other code
  EncodeVideoFrame(video_frame, time_when_posted_us);
}

void VideoStreamEncoder::EncodeVideoFrame(const VideoFrame& video_frame,
                                          int64_t time_when_posted_us) {
  // skip other code
  VideoFrame out_frame(video_frame);
  encoder_->Encode(out_frame, &next_frame_types_);
}

Tracing the code, encoder_ is created by InternalEncoderFactory:

std::unique_ptr<VideoEncoder> InternalEncoderFactory::CreateVideoEncoder(
    const SdpVideoFormat& format) {
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp8CodecName))
    return VP8Encoder::Create();
  if (absl::EqualsIgnoreCase(format.name, cricket::kVp9CodecName))
    return VP9Encoder::Create(cricket::VideoCodec(format));
  if (absl::EqualsIgnoreCase(format.name, cricket::kH264CodecName))
    return H264Encoder::Create(cricket::VideoCodec(format));
  if (kIsLibaomAv1EncoderSupported &&
      absl::EqualsIgnoreCase(format.name, cricket::kAv1CodecName))
    return CreateLibaomAv1Encoder();
  RTC_LOG(LS_ERROR) << "Trying to created encoder of unsupported format "
                    << format.name;
  return nullptr;
}

std::unique_ptr<H264Encoder> H264Encoder::Create(
    const cricket::VideoCodec& codec) {
  RTC_DCHECK(H264Encoder::IsSupported());
#if defined(WEBRTC_USE_H264)
  RTC_CHECK(g_rtc_use_h264);
  RTC_LOG(LS_INFO) << "Creating H264EncoderImpl.";
  return std::make_unique<H264EncoderImpl>(codec);
#else
  RTC_NOTREACHED();
  return nullptr;
#endif
}
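
For context, this is roughly how a caller would use the factory; the SdpVideoFormat value below is an assumption chosen only for illustration:

// Illustrative call into the factory shown above.
InternalEncoderFactory factory;
std::unique_ptr<VideoEncoder> encoder =
    factory.CreateVideoEncoder(SdpVideoFormat(cricket::kH264CodecName));
// With WEBRTC_USE_H264 defined, encoder is an H264EncoderImpl, and its
// Encode() method is what VideoStreamEncoder::EncodeVideoFrame() ends up calling.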

Taking H.264 as the example, encoding happens in the following function:

int32_t H264EncoderImpl::Encode(const VideoFrame& input_frame,
                                const std::vector<VideoFrameType>* frame_types) {
  rtc::scoped_refptr<const I420BufferInterface> frame_buffer =
      input_frame.video_frame_buffer()->ToI420();
  // Encode image for each layer.
  for (size_t i = 0; i < encoders_.size(); ++i) {
    if (i == 0) {
      pictures_[i].iStride[0] = frame_buffer->StrideY();
      pictures_[i].iStride[1] = frame_buffer->StrideU();
      pictures_[i].iStride[2] = frame_buffer->StrideV();
      pictures_[i].pData[0] = const_cast<uint8_t*>(frame_buffer->DataY());
      pictures_[i].pData[1] = const_cast<uint8_t*>(frame_buffer->DataU());
      pictures_[i].pData[2] = const_cast<uint8_t*>(frame_buffer->DataV());
    } else {
      // skip the code
    }

    // Encode!
    encoders_[i]->EncodeFrame(&pictures_[i], &info);

    encoded_images_[i]._encodedWidth = configurations_[i].width;
    encoded_images_[i]._encodedHeight = configurations_[i].height;
    // Set the RTP timestamp (timestamp_rtp_); capture_time_ms_ is never set here
    // and stays at its default value 0.
    encoded_images_[i].SetTimestamp(input_frame.timestamp());
    encoded_images_[i]._frameType = ConvertToVideoFrameType(info.eFrameType);
    encoded_images_[i].SetSpatialIndex(configurations_[i].simulcast_idx);

    // Split encoded image up into fragments. This also updates
    // |encoded_image_|.
    // After encoding, the bitstream lives in |info|; RtpFragmentize copies it
    // into encoded_images_[i] and records the NALU layout in frag_header.
    RTPFragmentationHeader frag_header;
    RtpFragmentize(&encoded_images_[i], &info, &frag_header);

    // On success, the encoded image is handed back through the callback; the
    // receiver is VideoStreamEncoder.
    encoded_image_callback_->OnEncodedImage(encoded_images_[i], &codec_specific,
                                            &frag_header);
  }
}

The callback returns to VideoStreamEncoder:

EncodedImageCallback::Result VideoStreamEncoder::OnEncodedImage(
    const EncodedImage& encoded_image,
    const CodecSpecificInfo* codec_specific_info,
    const RTPFragmentationHeader* fragmentation) {
  EncodedImageCallback::Result result = sink_->OnEncodedImage(
      image_copy, codec_info_copy ? codec_info_copy.get() : codec_specific_info,
      fragmentation_copy ? fragmentation_copy.get() : fragmentation);
}

This forwards to VideoSendStreamImpl, where rtp_video_sender_ packetizes and sends the data:

EncodedImageCallback::Result VideoSendStreamImpl::OnEncodedImage(
    const EncodedImage& encoded_image,
    const CodecSpecificInfo* codec_specific_info,
    const RTPFragmentationHeader* fragmentation) {
  EncodedImageCallback::Result result(EncodedImageCallback::Result::OK);
  result = rtp_video_sender_->OnEncodedImage(encoded_image, codec_specific_info,
                                             fragmentation);
}

RtpVideoSender sends the RTP packets and the RTCP SR packets:

EncodedImageCallback::Result RtpVideoSender::OnEncodedImage(
    const EncodedImage& encoded_image,
    const CodecSpecificInfo* codec_specific_info,
    const RTPFragmentationHeader* fragmentation) {
  // Compute the RTP timestamp: add the StartTimestamp offset, which defaults
  // to a random value.
  uint32_t rtp_timestamp =
      encoded_image.Timestamp() +
      rtp_streams_[stream_index].rtp_rtcp->StartTimestamp();

  // RTCPSender has it's own copy of the timestamp offset, added in
  // RTCPSender::BuildSR, hence we must not add the in the offset for this call.
  // TODO(nisse): Delete RTCPSender:timestamp_offset_, and see if we can confine
  // knowledge of the offset to a single place.
  // In other words: RTCPSender keeps its own copy of the timestamp offset, so
  // the offset must not be added to the timestamp passed into OnSendingRtpFrame.
  if (!rtp_streams_[stream_index].rtp_rtcp->OnSendingRtpFrame(
          encoded_image.Timestamp(),
          encoded_image.capture_time_ms_,  // no assignment to capture_time_ms_ was found, so it stays 0
          rtp_config_.payload_type,
          encoded_image._frameType == VideoFrameType::kVideoFrameKey)) {
    // The payload router could be active but this module isn't sending.
    return Result(Result::ERROR_SEND_FAILED);
  }

  bool send_result = rtp_streams_[stream_index].sender_video->SendEncodedImage(
      rtp_config_.payload_type, codec_type_, rtp_timestamp, encoded_image,
      fragmentation,
      params_[stream_index].GetRtpVideoHeader(encoded_image, codec_specific_info,
                                              shared_frame_id_),
      expected_retransmission_time_ms);
}

bool RTPSenderVideo::SendEncodedImage(
    int payload_type,
    absl::optional<VideoCodecType> codec_type,
    uint32_t rtp_timestamp,
    const EncodedImage& encoded_image,
    const RTPFragmentationHeader* fragmentation,
    RTPVideoHeader video_header,
    absl::optional<int64_t> expected_retransmission_time_ms) {
  return SendVideo(payload_type, codec_type, rtp_timestamp,
                   encoded_image.capture_time_ms_, encoded_image, fragmentation,
                   video_header, expected_retransmission_time_ms);
}

bool RTPSenderVideo::SendVideo(
    int payload_type,
    absl::optional<VideoCodecType> codec_type,
    uint32_t rtp_timestamp,
    int64_t capture_time_ms,
    rtc::ArrayView<const uint8_t> payload,
    const RTPFragmentationHeader* fragmentation,
    RTPVideoHeader video_header,
    absl::optional<int64_t> expected_retransmission_time_ms) {
  std::unique_ptr<RtpPacketToSend> single_packet = rtp_sender_->AllocatePacket();
  RTC_DCHECK_LE(packet_capacity, single_packet->capacity());
  single_packet->SetPayloadType(payload_type);    // set the payload type (pt)
  single_packet->SetTimestamp(rtp_timestamp);     // set the RTP timestamp
  single_packet->set_capture_time_ms(capture_time_ms);

  // skip other code

  bool first_frame = first_frame_sent_();
  std::vector<std::unique_ptr<RtpPacketToSend>> rtp_packets;
  for (size_t i = 0; i < num_packets; ++i) {
    std::unique_ptr<RtpPacketToSend> packet;
    int expected_payload_capacity;
    // Choose right packet template:
    if (num_packets == 1) {
      packet = std::move(single_packet);
      expected_payload_capacity =
          limits.max_payload_len - limits.single_packet_reduction_len;
    } else if (i == 0) {
      packet = std::move(first_packet);
      expected_payload_capacity =
          limits.max_payload_len - limits.first_packet_reduction_len;
    } else if (i == num_packets - 1) {
      packet = std::move(last_packet);
      expected_payload_capacity =
          limits.max_payload_len - limits.last_packet_reduction_len;
    } else {
      packet = std::make_unique<RtpPacketToSend>(*middle_packet);
      expected_payload_capacity = limits.max_payload_len;
    }
    packet->set_first_packet_of_frame(i == 0);

    // RtpPacketizerH264: take the next payload chunk and write it into the packet.
    if (!packetizer->NextPacket(packet.get()))
      return false;
    RTC_DCHECK_LE(packet->payload_size(), expected_payload_capacity);

    // Assign the sequence number.
    if (!rtp_sender_->AssignSequenceNumber(packet.get()))
      return false;

    // No FEC protection for upper temporal layers, if used.
    bool protect_packet = temporal_id == 0 || temporal_id == kNoTemporalIdx;
    packet->set_allow_retransmission(allow_retransmission);

    // Put packetization finish timestamp into extension.
    if (packet->HasExtension<VideoTimingExtension>()) {
      packet->set_packetization_finish_time_ms(clock_->TimeInMilliseconds());
    }

    // FEC handling.
    if (protect_packet && fec_generator_) {
      if (red_enabled() &&
          exclude_transport_sequence_number_from_fec_experiment_) {
        // See comments at the top of the file why experiment
        // "WebRTC-kExcludeTransportSequenceNumberFromFec" is needed in
        // conjunction with datagram transport.
        // TODO(sukhanov): We may also need to implement it for flexfec_sender
        // if we decide to keep this approach in the future.
        uint16_t transport_senquence_number;
        if (packet->GetExtension<TransportSequenceNumber>(
                &transport_senquence_number)) {
          if (!packet->RemoveExtension(webrtc::TransportSequenceNumber::kId)) {
            RTC_NOTREACHED()
                << "Failed to remove transport sequence number, packet="
                << packet->ToString();
          }
        }
      }
      fec_generator_->AddPacketAndGenerateFec(*packet);
    }

    if (red_enabled()) {
      // Send a RED (redundant) packet.
      std::unique_ptr<RtpPacketToSend> red_packet(new RtpPacketToSend(*packet));
      BuildRedPayload(*packet, red_packet.get());
      red_packet->SetPayloadType(*red_payload_type_);
      // Send |red_packet| instead of |packet| for allocated sequence number.
      red_packet->set_packet_type(RtpPacketMediaType::kVideo);
      red_packet->set_allow_retransmission(packet->allow_retransmission());
      rtp_packets.emplace_back(std::move(red_packet));
    } else {
      // Send the original packet.
      packet->set_packet_type(RtpPacketMediaType::kVideo);
      rtp_packets.emplace_back(std::move(packet));
    }

    if (first_frame) {
      if (i == 0) {
        RTC_LOG(LS_INFO)
            << "Sent first RTP packet of the first video frame (pre-pacer)";
      }
      if (i == num_packets - 1) {
        RTC_LOG(LS_INFO)
            << "Sent last RTP packet of the first video frame (pre-pacer)";
      }
    }
  }

  if (fec_generator_) {
    // Fetch the generated FEC packets.
    std::vector<std::unique_ptr<RtpPacketToSend>> fec_packets =
        fec_generator_->GetFecPackets();
    // TODO(bugs.webrtc.org/11340): Move sequence number assignment into
    // UlpfecGenerator.
    const bool generate_sequence_numbers = !fec_generator_->FecSsrc();
    for (auto& fec_packet : fec_packets) {
      if (generate_sequence_numbers) {
        rtp_sender_->AssignSequenceNumber(fec_packet.get());
      }
      rtp_packets.emplace_back(std::move(fec_packet));
    }
  }

  // Send the RTP packets.
  LogAndSendToNetwork(std::move(rtp_packets), unpacketized_payload_size);
}
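
For orientation (this is general RTP/RTCP behavior, not code from this excerpt): because StartTimestamp defaults to a random value, the absolute RTP timestamp means nothing by itself; the receiver uses the NTP/RTP pair carried in the sender report to map it back to wall-clock time.

// Illustrative receiver-side mapping, assuming an SR carrying (sr_ntp_ms, sr_rtp):
//   capture_ntp_ms = sr_ntp_ms + (int32_t)(packet_rtp - sr_rtp) / 90.0
// i.e. the RTP delta at 90 kHz, converted to ms and added to the SR's NTP time.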

Finally the RTP packet is sent:

void RtpSenderEgress::SendPacket(RtpPacketToSend* packet,
                                 const PacedPacketInfo& pacing_info) {
  const bool send_success = SendPacketToNetwork(*packet, options, pacing_info);
}

bool RtpSenderEgress::SendPacketToNetwork(const RtpPacketToSend& packet,
                                          const PacketOptions& options,
                                          const PacedPacketInfo& pacing_info) {
  int bytes_sent = -1;
  if (transport_) {
    UpdateRtpOverhead(packet);
    bytes_sent = transport_->SendRtp(packet.data(), packet.size(), options)
                     ? static_cast<int>(packet.size())
                     : -1;
    if (event_log_ && bytes_sent > 0) {
      event_log_->Log(std::make_unique<RtcEventRtpPacketOutgoing>(
          packet, pacing_info.probe_cluster_id));
    }
  }
}

The RTCP SR packet is built as follows:

std::unique_ptr<rtcp::RtcpPacket> RTCPSender::BuildSR(const RtcpContext& ctx) {
  // Timestamp shouldn't be estimated before first media frame.
  RTC_DCHECK_GE(last_frame_capture_time_ms_, 0);
  // The timestamp of this RTCP packet should be estimated as the timestamp of
  // the frame being captured at this moment. We are calculating that
  // timestamp as the last frame's timestamp + the time since the last frame
  // was captured.
  int rtp_rate = rtp_clock_rates_khz_[last_payload_type_];
  uint32_t rtp_timestamp =
      timestamp_offset_ + last_rtp_timestamp_ +
      ((ctx.now_us_ + 500) / 1000 - last_frame_capture_time_ms_) * rtp_rate;

  rtcp::SenderReport* report = new rtcp::SenderReport();
  report->SetSenderSsrc(ssrc_);
  // NTP time of this SR, converted from the current local time.
  report->SetNtp(TimeMicrosToNtp(ctx.now_us_));
  // RTP timestamp = last frame's rtp_timestamp + the offset increment + the
  // time elapsed since that frame, expressed in RTP ticks.
  report->SetRtpTimestamp(rtp_timestamp);
  report->SetPacketCount(ctx.feedback_state_.packets_sent);
  report->SetOctetCount(ctx.feedback_state_.media_bytes_sent);
  report->SetReportBlocks(CreateReportBlocks(ctx.feedback_state_));
  return std::unique_ptr<rtcp::RtcpPacket>(report);
}
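
A small worked example of that estimate (made-up numbers, 90 kHz video clock so rtp_rate = 90 ticks/ms):

// Illustrative numbers only:
//   timestamp_offset_        = T0   (the random StartTimestamp)
//   last_rtp_timestamp_      = 2970
//   now - last frame capture = 20 ms
//   SR rtp timestamp         = T0 + 2970 + 20 * 90 = T0 + 4770
// The SR pairs this RTP value with the current NTP time, which is exactly the
// mapping a receiver needs to line the media timeline up with wall-clock time.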

The following is based on older (non-latest) code:
  --> Set the frame's render_time_ms
      1. Since the incoming captureTime is 0, it is set to the local timestamp:
      captureFrame.set_render_time_ms(TickTime::MillisecondTimestamp())
      2. Otherwise, it is set to the NTP value minus (the difference between the NTP value and the local timestamp).

      The class constructor initializes that difference between NTP time and the local timestamp:

      delta_ntp_internal_ms_(
          Clock::GetRealTimeClock()->CurrentNtpInMilliseconds() - TickTime::MillisecondTimestamp())

      captureFrame.set_render_time_ms(capture_time - delta_ntp_internal_ms_);

      As you can see, the resulting value is effectively just the local timestamp again.

  --> last_capture_time_ = captureFrame.render_time_ms(); // keep the most recent capture timestamp (local time)
   --> ViECapturer::OnIncomingCapturedFrame(I420VideoFrame& video_frame)
   --> // Make sure we render this frame earlier since we know the render time set
      // is slightly off since it's being set when the frame has been received from

      // the camera, and not when the camera actually captured the frame.

      Subtract the capture delay (190 ms on Android):

video_frame.set_render_time_ms(video_frame.render_time_ms() - FrameDelay());

3. Frame flow from capture to encoding

ViECapturer::ViECaptureProcess()
 --> ViECapturer::DeliverI420Frame(I420VideoFrame* video_frame)
  --> ViEFrameProviderBase::DeliverFrame()
   --> ViERenderer::DeliverFrame()  // local preview (render) window
   --> ViEEncoder::DeliverFrame()   // delivered to the encoder
    --> Convert render time, in ms, to RTP timestamp.
      const int kMsToRtpTimestamp = 90;
      const uint32_t time_stamp =
          kMsToRtpTimestamp * static_cast<uint32_t>(video_frame->render_time_ms());
      video_frame->set_timestamp(time_stamp);
    --> VideoCodingModuleImpl::AddVideoFrame()
     --> VideoSender::AddVideoFrame
      --> VCMGenericEncoder::Encode
       --> VideoEncoder::Encode()  // for VP8 this is the VideoEncoder subclass VP8Encoder,
                                   // for H264 the VideoEncoder subclass H264Encoder
       --> VCMEncodedFrameCallback::Encoded()
        --> VCMPacketizationCallback::SendData()
   --> ViEEncoder::SendData()


Author: lincai2018



