Preface
In the previous article, "How does WebRtc process audio data?", I discussed how audio data is handled; this time let's look at how video data is processed.
This article is based on the PineAppRtc open source project.
We had a requirement to relay the video stream received from WebRtc on to another destination, so I looked into how WebRtc handles video data, and that investigation became this article.
Receiving the data
When using webrtc for a real-time call, once the two peers are connected a PeerConnection object is created from the negotiated parameters. The concrete code lives in the PeerConnectionClient class, which you implement yourself; this connection is what pushes and pulls the media streams.
In PeerConnectionClient we can find PCObserver, which implements the PeerConnection.Observer interface. In its onAddStream callback:
if (stream.videoTracks.size() == 1) {
    mRemoteVideoTrack = stream.videoTracks.get(0);
    mRemoteVideoTrack.setEnabled(mRenderVideo);
    for (VideoRenderer.Callbacks remoteRender : mRemoteRenders) {
        mRemoteVideoTrack.addRenderer(new VideoRenderer(remoteRender));
    }
}
We can see that a VideoRenderer is added to mRemoteVideoTrack; this VideoRenderer is what processes the received video data.
The VideoRenderer constructor takes a VideoRenderer.Callbacks, which is an interface. Taking one of its implementations, SurfaceViewRenderer, as an example, its renderFrame callback looks like this:
public void renderFrame(I420Frame frame) {
    this.updateFrameDimensionsAndReportEvents(frame);
    this.eglRenderer.renderFrame(frame);
}
This I420Frame wraps the received video data.
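For reference, here is a small helper of my own (not from PineAppRtc; a sketch based only on the I420Frame fields that appear in the library code quoted in this article) that logs the parts of a frame we will rely on later:

import android.util.Log;
import java.util.Arrays;
import org.webrtc.VideoRenderer;

// Hedged sketch: dumps the I420Frame fields this walkthrough relies on.
public final class FrameInfo {
    public static void describe(VideoRenderer.I420Frame frame) {
        Log.d("FrameInfo", frame.width + "x" + frame.height + " rotation=" + frame.rotationDegree);
        if (frame.yuvFrame) {
            // Planar I420 data: yuvPlanes is a ByteBuffer[3] (Y, U, V)
            // and yuvStrides holds the row stride of each plane.
            Log.d("FrameInfo", "I420 planes, strides=" + Arrays.toString(frame.yuvStrides));
        } else {
            // Texture frame: the pixels live in an OES texture instead.
            Log.d("FrameInfo", "texture frame, id=" + frame.textureId);
        }
    }
}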
Rendering the data
renderFrame calls eglRenderer.renderFrame, which kicks off the actual drawing:
public void renderFrame(I420Frame frame) {
    ...
    synchronized(this.handlerLock) {
        ...
        synchronized(this.frameLock) {
            ...
            this.pendingFrame = frame;
            this.renderThreadHandler.post(this.renderFrameRunnable);
        }
    }
    ...
}
It stores the frame in pendingFrame and then posts a runnable to the render thread; the runnable looks like this:
private final Runnable renderFrameRunnable = new Runnable() {
    public void run() {
        EglRenderer.this.renderFrameOnRenderThread();
    }
};
which simply calls renderFrameOnRenderThread:
private void renderFrameOnRenderThread() {
    I420Frame frame;
    synchronized(this.frameLock) {
        ...
        frame = this.pendingFrame;
        this.pendingFrame = null;
    }
    if (this.eglBase != null && this.eglBase.hasSurface()) {
        ...
        int[] yuvTextures = shouldUploadYuvTextures ? this.yuvUploader.uploadYuvData(frame.width, frame.height, frame.yuvStrides, frame.yuvPlanes) : null;
        if (shouldRenderFrame) {
            GLES20.glClearColor(0.0F, 0.0F, 0.0F, 0.0F);
            GLES20.glClear(16384); // 16384 == GLES20.GL_COLOR_BUFFER_BIT
            if (frame.yuvFrame) {
                this.drawer.drawYuv(yuvTextures, drawMatrix, drawnFrameWidth, drawnFrameHeight, 0, 0, this.eglBase.surfaceWidth(), this.eglBase.surfaceHeight());
            } else {
                this.drawer.drawOes(frame.textureId, drawMatrix, drawnFrameWidth, drawnFrameHeight, 0, 0, this.eglBase.surfaceWidth(), this.eglBase.surfaceHeight());
            }
            ...
        }
        this.notifyCallbacks(frame, yuvTextures, texMatrix, shouldRenderFrame);
        VideoRenderer.renderFrameDone(frame);
    } else {
        this.logD("Dropping frame - No surface");
        VideoRenderer.renderFrameDone(frame);
    }
}
The I420Frame's YUV planes are uploaded into textures (the int[] of texture ids), and the drawer's corresponding drawXxx function then does the actual drawing.
Intercepting the data
So if we want to process the received data ourselves, we need to implement our own VideoRenderer.Callbacks, wrap it in a VideoRenderer, and add it to mRemoteVideoTrack.
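A minimal sketch of such an interceptor (my own code, not from PineAppRtc; the class name is made up):

import org.webrtc.VideoRenderer;

// Hedged sketch: a VideoRenderer.Callbacks that intercepts the remote frames.
public class FrameInterceptor implements VideoRenderer.Callbacks {
    @Override
    public void renderFrame(VideoRenderer.I420Frame frame) {
        try {
            // Hand the frame to our own pipeline here (copy/convert before releasing it).
        } finally {
            // Every frame must be returned to WebRtc when we are done with it.
            VideoRenderer.renderFrameDone(frame);
        }
    }
}

It is attached exactly like the renderers above: mRemoteVideoTrack.addRenderer(new VideoRenderer(new FrameInterceptor()));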
That leaves one question: how do we turn an I420Frame into raw data?
I found another implementation of VideoRenderer.Callbacks, VideoFileRenderer. Since it writes frames to a file, it must produce raw bytes at some point; here is part of its code:
public void renderFrame(final I420Frame frame) {
    this.renderThreadHandler.post(new Runnable() {
        public void run() {
            VideoFileRenderer.this.renderFrameOnRenderThread(frame);
        }
    });
}

private void renderFrameOnRenderThread(I420Frame frame) {
    float frameAspectRatio = (float)frame.rotatedWidth() / (float)frame.rotatedHeight();
    float[] rotatedSamplingMatrix = RendererCommon.rotateTextureMatrix(frame.samplingMatrix, (float)frame.rotationDegree);
    float[] layoutMatrix = RendererCommon.getLayoutMatrix(false, frameAspectRatio, (float)this.outputFileWidth / (float)this.outputFileHeight);
    float[] texMatrix = RendererCommon.multiplyMatrices(rotatedSamplingMatrix, layoutMatrix);
    try {
        ByteBuffer buffer = nativeCreateNativeByteBuffer(this.outputFrameSize);
        if (frame.yuvFrame) {
            nativeI420Scale(frame.yuvPlanes[0], frame.yuvStrides[0], frame.yuvPlanes[1], frame.yuvStrides[1], frame.yuvPlanes[2], frame.yuvStrides[2], frame.width, frame.height, this.outputFrameBuffer, this.outputFileWidth, this.outputFileHeight);
            buffer.put(this.outputFrameBuffer.array(), this.outputFrameBuffer.arrayOffset(), this.outputFrameSize);
        } else {
            this.yuvConverter.convert(this.outputFrameBuffer, this.outputFileWidth, this.outputFileHeight, this.outputFileWidth, frame.textureId, texMatrix);
            ...
        }
        buffer.rewind();
        this.rawFrames.add(buffer);
    } finally {
        VideoRenderer.renderFrameDone(frame);
    }
}
So what we get is an I420Frame. The video data it wraps is in I420 format, with Y, U and V stored in separate planes: yuvPlanes is a ByteBuffer[], where yuvPlanes[0] is the Y plane, yuvPlanes[1] is the U plane and yuvPlanes[2] is the V plane. Since I420 subsamples the chroma planes, a width × height frame has width × height bytes of Y and width/2 × height/2 bytes each of U and V, i.e. width * height * 3 / 2 bytes in total.
We usually cannot use this data directly, so it has to be converted, for example to NV21.
NV21 consists of the full Y plane followed by interleaved V and U bytes (YYYY...VUVU...), so the following method converts an I420Frame into an NV21 byte array:
public static byte[] convertLineByLine(org.webrtc.VideoRenderer.I420Frame src) {
    byte[] bytes = new byte[src.width * src.height * 3 / 2];
    int i = 0;
    // copy the full Y plane first
    for (int row = 0; row < src.height; row++) {
        for (int col = 0; col < src.width; col++) {
            bytes[i++] = src.yuvPlanes[0].get(col + row * src.yuvStrides[0]);
        }
    }
    // then interleave V and U (NV21 order) from the quarter-size chroma planes
    for (int row = 0; row < src.height / 2; row++) {
        for (int col = 0; col < src.width / 2; col++) {
            bytes[i++] = src.yuvPlanes[2].get(col + row * src.yuvStrides[2]);
            bytes[i++] = src.yuvPlanes[1].get(col + row * src.yuvStrides[1]);
        }
    }
    return bytes;
}
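Putting the two pieces together, the FrameInterceptor sketched earlier could convert each planar frame before forwarding it (again a sketch; where the NV21 bytes go is up to your own code):

@Override
public void renderFrame(VideoRenderer.I420Frame frame) {
    try {
        if (frame.yuvFrame) {
            // Only planar frames can be converted this way; texture frames
            // would first have to be read back, as VideoFileRenderer does.
            byte[] nv21 = convertLineByLine(frame);
            // relay or encode nv21 here
        }
    } finally {
        VideoRenderer.renderFrameDone(frame);
    }
}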
From this analysis we can see that WebRtc transports video in I420 format. On the sending side the library converts the raw capture data to I420 for us in the native layer, but on the receiving side we are handed the I420 data directly, so if we want to work with it ourselves we first have to convert it into a more common format.
Capturing and sending the data
As described at the beginning, when a call is set up a PeerConnection object is created in PeerConnectionClient, and the same connection is used to push the local stream. A MediaStream object is then created and added to the PeerConnection:
mPeerConnection.addStream(mMediaStream);
The MediaStream carries the media; multiple tracks can be added to it, for example an audio track and a video track:
mMediaStream.addTrack(createVideoTrack(mVideoCapturer));
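For context, mMediaStream itself comes from the PeerConnectionFactory; a minimal sketch of the surrounding setup (the stream label is arbitrary, and mFactory/mPeerConnection follow the naming used in PeerConnectionClient):

// Hedged sketch of the setup around the two calls above.
mMediaStream = mFactory.createLocalMediaStream("local_stream");
mMediaStream.addTrack(createVideoTrack(mVideoCapturer));
mPeerConnection.addStream(mMediaStream);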
Here mVideoCapturer is a VideoCapturer object responsible for capturing video; in essence it wraps the camera.
VideoCapturer is an interface with several implementations; here we will use CameraCapturer and its subclass Camera1Capturer as the example.
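In this version of the SDK a Camera1Capturer is usually obtained through a CameraEnumerator rather than constructed directly; a hedged sketch (device selection simplified, no events handler):

// Hedged sketch: pick the first front-facing camera via Camera1Enumerator.
// false = deliver frames as byte buffers rather than OES textures.
Camera1Enumerator enumerator = new Camera1Enumerator(false);
VideoCapturer capturer = null;
for (String deviceName : enumerator.getDeviceNames()) {
    if (enumerator.isFrontFacing(deviceName)) {
        capturer = enumerator.createCapturer(deviceName, null);
        break;
    }
}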
Let's continue with the createVideoTrack function:
private VideoTrack createVideoTrack(VideoCapturer capturer) {
    mVideoSource = mFactory.createVideoSource(capturer);
    capturer.startCapture(mVideoWidth, mVideoHeight, mVideoFps);
    mLocalVideoTrack = mFactory.createVideoTrack(VIDEO_TRACK_ID, mVideoSource);
    mLocalVideoTrack.setEnabled(mRenderVideo);
    mLocalVideoTrack.addRenderer(new VideoRenderer(mLocalRender));
    return mLocalVideoTrack;
}
As you can see, createVideoSource wraps the VideoCapturer in a VideoSource object, and that VideoSource is then used to create the VideoTrack.
Here is createVideoSource:
public VideoSource createVideoSource(VideoCapturer capturer) {
    org.webrtc.EglBase.Context eglContext = this.localEglbase == null ? null : this.localEglbase.getEglBaseContext();
    SurfaceTextureHelper surfaceTextureHelper = SurfaceTextureHelper.create("VideoCapturerThread", eglContext);
    long nativeAndroidVideoTrackSource = nativeCreateVideoSource(this.nativeFactory, surfaceTextureHelper, capturer.isScreencast());
    CapturerObserver capturerObserver = new AndroidVideoTrackSourceObserver(nativeAndroidVideoTrackSource);
    capturer.initialize(surfaceTextureHelper, ContextUtils.getApplicationContext(), capturerObserver);
    return new VideoSource(nativeAndroidVideoTrackSource);
}
Here a new AndroidVideoTrackSourceObserver object is created (an implementation of the CapturerObserver interface) and passed to the VideoCapturer's initialize function.
In CameraCapturer's implementation of initialize, this AndroidVideoTrackSourceObserver is stored in the VideoCapturer's capturerObserver field.
Going back to PeerConnectionClient, it also calls the VideoCapturer's startCapture function; here is its implementation in CameraCapturer:
public void startCapture(int width, int height, int framerate) {
    Logging.d("CameraCapturer", "startCapture: " + width + "x" + height + "@" + framerate);
    if (this.applicationContext == null) {
        throw new RuntimeException("CameraCapturer must be initialized before calling startCapture.");
    } else {
        synchronized(this.stateLock) {
            if (!this.sessionOpening && this.currentSession == null) {
                ...
                this.createSessionInternal(0, (MediaRecorder)null);
            } else {
                Logging.w("CameraCapturer", "Session already open");
            }
        }
    }
}
which ends up calling createSessionInternal:
private void createSessionInternal(int delayMs, final MediaRecorder mediaRecorder) {
    this.uiThreadHandler.postDelayed(this.openCameraTimeoutRunnable, (long)(delayMs + 10000));
    this.cameraThreadHandler.postDelayed(new Runnable() {
        public void run() {
            CameraCapturer.this.createCameraSession(CameraCapturer.this.createSessionCallback, CameraCapturer.this.cameraSessionEventsHandler, CameraCapturer.this.applicationContext, CameraCapturer.this.surfaceHelper, mediaRecorder, CameraCapturer.this.cameraName, CameraCapturer.this.width, CameraCapturer.this.height, CameraCapturer.this.framerate);
        }
    }, (long)delayMs);
}
which in turn calls createCameraSession; in Camera1Capturer that function looks like this:
protected void createCameraSession(CreateSessionCallback createSessionCallback, Events events, Context applicationContext, SurfaceTextureHelper surfaceTextureHelper, MediaRecorder mediaRecorder, String cameraName, int width, int height, int framerate) {
    Camera1Session.create(createSessionCallback, events, this.captureToTexture || mediaRecorder != null, applicationContext, surfaceTextureHelper, mediaRecorder, Camera1Enumerator.getCameraIndex(cameraName), width, height, framerate);
}
As you can see, a Camera1Session is created. This is the class that actually drives the camera, and inside it we finally meet the familiar Camera. In its listenForBytebufferFrames function:
private void listenForBytebufferFrames() {
    this.camera.setPreviewCallbackWithBuffer(new PreviewCallback() {
        public void onPreviewFrame(byte[] data, Camera callbackCamera) {
            Camera1Session.this.checkIsOnCameraThread();
            if (callbackCamera != Camera1Session.this.camera) {
                Logging.e("Camera1Session", "Callback from a different camera. This should never happen.");
            } else if (Camera1Session.this.state != Camera1Session.SessionState.RUNNING) {
                Logging.d("Camera1Session", "Bytebuffer frame captured but camera is no longer running.");
            } else {
                long captureTimeNs = TimeUnit.MILLISECONDS.toNanos(SystemClock.elapsedRealtime());
                if (!Camera1Session.this.firstFrameReported) {
                    int startTimeMs = (int)TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - Camera1Session.this.constructionTimeNs);
                    Camera1Session.camera1StartTimeMsHistogram.addSample(startTimeMs);
                    Camera1Session.this.firstFrameReported = true;
                }
                Camera1Session.this.events.onByteBufferFrameCaptured(Camera1Session.this, data, Camera1Session.this.captureFormat.width, Camera1Session.this.captureFormat.height, Camera1Session.this.getFrameOrientation(), captureTimeNs);
                Camera1Session.this.camera.addCallbackBuffer(data);
            }
        }
    });
}
Once the preview callback onPreviewFrame delivers the video data, events.onByteBufferFrameCaptured is called. This events is the one passed in at create time; tracing back through the flow above, it is the cameraSessionEventsHandler inside CameraCapturer, whose onByteBufferFrameCaptured function looks like this:
public void onByteBufferFrameCaptured(CameraSession session, byte[] data, int width, int height, int rotation, long timestamp) {
    CameraCapturer.this.checkIsOnCameraThread();
    synchronized(CameraCapturer.this.stateLock) {
        if (session != CameraCapturer.this.currentSession) {
            Logging.w("CameraCapturer", "onByteBufferFrameCaptured from another session.");
        } else {
            if (!CameraCapturer.this.firstFrameObserved) {
                CameraCapturer.this.eventsHandler.onFirstFrameAvailable();
                CameraCapturer.this.firstFrameObserved = true;
            }
            CameraCapturer.this.cameraStatistics.addFrame();
            CameraCapturer.this.capturerObserver.onByteBufferFrameCaptured(data, width, height, rotation, timestamp);
        }
    }
}
This calls capturerObserver.onByteBufferFrameCaptured; the capturerObserver is the AndroidVideoTrackSourceObserver passed in earlier via initialize, and its onByteBufferFrameCaptured function is:
public void onByteBufferFrameCaptured(byte[] data, int width, int height, int rotation, long timeStamp) {
    this.nativeOnByteBufferFrameCaptured(this.nativeSource, data, data.length, width, height, rotation, timeStamp);
}
It calls into a native function, and that is where the Java-side flow ends; the data is presumably encoded and sent in the native layer.
The key piece here is really the VideoCapturer: besides CameraCapturer and its subclasses there are other implementations such as FileVideoCapturer.
If we want to send raw byte[] data directly, we can implement our own VideoCapturer, hold on to its capturerObserver, and call its onByteBufferFrameCaptured function ourselves.
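A hedged sketch of such a capturer (my own code; apart from initialize, startCapture and onByteBufferFrameCaptured, which all appear in the code above, the interface methods are written from memory of this SDK version and may need adjusting, and pushFrame is a made-up entry point):

import android.content.Context;
import android.os.SystemClock;
import java.util.concurrent.TimeUnit;
import org.webrtc.SurfaceTextureHelper;
import org.webrtc.VideoCapturer;

// Hedged sketch: a VideoCapturer that does no capturing of its own and
// simply forwards externally supplied byte[] frames to WebRtc.
public class RawDataCapturer implements VideoCapturer {
    private CapturerObserver capturerObserver;

    @Override
    public void initialize(SurfaceTextureHelper surfaceTextureHelper,
                           Context applicationContext,
                           CapturerObserver capturerObserver) {
        // Keep the observer so we can push frames into it ourselves.
        this.capturerObserver = capturerObserver;
    }

    @Override
    public void startCapture(int width, int height, int framerate) {
        // Nothing to start; frames arrive via pushFrame().
        capturerObserver.onCapturerStarted(true);
    }

    @Override
    public void stopCapture() {
        capturerObserver.onCapturerStopped();
    }

    @Override
    public void changeCaptureFormat(int width, int height, int framerate) {}

    @Override
    public void dispose() {}

    @Override
    public boolean isScreencast() {
        return false;
    }

    // Hypothetical entry point: feed byte[] frame data captured elsewhere.
    public void pushFrame(byte[] data, int width, int height, int rotation) {
        long timestampNs = TimeUnit.MILLISECONDS.toNanos(SystemClock.elapsedRealtime());
        capturerObserver.onByteBufferFrameCaptured(data, width, height, rotation, timestampNs);
    }
}

Wrapping it with mFactory.createVideoSource(capturer) exactly as in createVideoTrack above means every pushFrame call ends up in nativeOnByteBufferFrameCaptured, just like a camera frame would.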