API¶

fileio¶

mmcv.fileio.load(file, file_format=None, **kwargs)[source]¶

Load data from json/yaml/pickle files.

This method provides a unified API for loading data from serialized files.

Parameters:	file (str or `Path` or file-like object) – Filename or a file-like object. file_format (str, optional) – If not specified, the file format will be inferred from the file extension, otherwise use the specified one. Currently supported formats include “json”, “yaml/yml” and “pickle/pkl”.
Returns:	The content from the file.
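
The extension-based format inference described above can be illustrated with a minimal standard-library sketch. This is not mmcv's implementation: `load_sketch` and `_EXT_TO_FORMAT` are hypothetical names, and YAML is left out because it needs a third-party parser.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical mapping from extension to format name (illustration only).
_EXT_TO_FORMAT = {"json": "json", "pickle": "pickle", "pkl": "pickle"}


def load_sketch(file, file_format=None):
    """Load a JSON or pickle file, inferring the format from the extension."""
    path = Path(file)
    if file_format is None:
        # "data.json" -> "json"; an explicit file_format skips this step.
        file_format = path.suffix.lstrip(".").lower()
    fmt = _EXT_TO_FORMAT.get(file_format)
    if fmt is None:
        raise TypeError(f"Unsupported format: {file_format}")
    if fmt == "json":
        with open(path, "r") as f:
            return json.load(f)
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "data.json"
    json_path.write_text(json.dumps({"a": 1}))
    loaded_json = load_sketch(json_path)

    pkl_path = Path(tmp) / "data.bin"
    pkl_path.write_bytes(pickle.dumps([1, 2, 3]))
    # The extension is uninformative here, so the format is given explicitly.
    loaded_pkl = load_sketch(pkl_path, file_format="pkl")
```

Passing `file_format` explicitly is what makes files with unusual extensions loadable.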

mmcv.fileio.dump(obj, file=None, file_format=None, **kwargs)[source]¶

Dump data to json/yaml/pickle strings or files.

This method provides a unified API for dumping data as strings or to files, and also supports custom arguments for each file format.

Parameters:	obj (any) – The Python object to be dumped. file (str or `Path` or file-like object, optional) – If not specified, the object is dumped to a str, otherwise to the file specified by the filename or file-like object. file_format (str, optional) – Same as `load()`.
Returns:	True for success, False otherwise.
Return type:	bool

mmcv.fileio.list_from_file(filename, prefix='', offset=0, max_num=0)[source]¶

Load a text file and parse the content as a list of strings.

Parameters:	filename (str) – Filename. prefix (str) – The prefix to be inserted at the beginning of each item. offset (int) – The number of lines to skip at the start of the file. max_num (int) – The maximum number of lines to be read; zero or a negative value means no limit.
Returns:	A list of strings.
Return type:	list[str]
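
How `prefix`, `offset` and `max_num` interact can be shown with a small sketch. It only mirrors the semantics documented above; `list_from_file_sketch` is a hypothetical name and not the library function.

```python
import tempfile
from pathlib import Path


def list_from_file_sketch(filename, prefix="", offset=0, max_num=0):
    """Read lines as strings, skipping `offset` lines, reading at most `max_num`."""
    items = []
    with open(filename, "r") as f:
        for _ in range(offset):
            f.readline()  # discard the skipped lines
        for line in f:
            if 0 < max_num <= len(items):
                break  # a positive max_num caps the number of items
            items.append(prefix + line.rstrip("\n"))
    return items


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "names.txt"
    path.write_text("a.jpg\nb.jpg\nc.jpg\nd.jpg\n")
    all_items = list_from_file_sketch(path)
    subset = list_from_file_sketch(path, prefix="imgs/", offset=1, max_num=2)
```

Here `subset` skips the first line, keeps the next two, and prepends `imgs/` to each.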

mmcv.fileio.dict_from_file(filename, key_type=<class 'str'>)[source]¶

Load a text file and parse the content as a dict.

Each line of the text file must contain two or more columns separated by spaces or tabs. The first column is parsed as the dict key, and the following columns are parsed as the dict value.

Parameters:	filename (str) – Filename. key_type (type) – Type of the dict’s keys. str is used by default and type conversion will be performed if specified.
Returns:	The parsed contents.
Return type:	dict
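
A minimal sketch of the column parsing described above. `dict_from_file_sketch` is a hypothetical name; how lines with a single column are handled, and whether multiple value columns become a list, are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def dict_from_file_sketch(filename, key_type=str):
    """Parse 'key value [value ...]' lines into a dict."""
    mapping = {}
    with open(filename, "r") as f:
        for line in f:
            items = line.split()  # splits on any run of spaces or tabs
            if len(items) < 2:
                continue  # assumption: skip lines without a value column
            key = key_type(items[0])
            # Assumption: one value column stays a str, several become a list.
            mapping[key] = items[1] if len(items) == 2 else items[1:]
    return mapping


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "labels.txt"
    path.write_text("1 cat\n2 dog\tcow\n")
    parsed = dict_from_file_sketch(path, key_type=int)
```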

image¶

mmcv.image.solarize(img, thr=128)[source]¶

Solarize an image (invert all pixel values above a threshold).

Parameters:	img (ndarray) – Image to be solarized. thr (int) – Threshold for solarizing (0 - 255).
Returns:	The solarized image.
Return type:	ndarray

mmcv.image.posterize(img, bits)[source]¶

Posterize an image (reduce the number of bits for each color channel).

Parameters:	img (ndarray) – Image to be posterized. bits (int) – Number of bits (1 to 8) to use for posterizing.
Returns:	The posterized image.
Return type:	ndarray
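
The two point operations above, solarize and posterize, can be sketched on single 8-bit values. This is a pure-Python illustration, not the array-based library code; the function names are hypothetical, and treating values equal to the threshold as "above" is an assumption.

```python
def solarize_pixel(value, thr=128):
    """Invert an 8-bit value when it is at or above the threshold."""
    # Assumption: values equal to thr are inverted too.
    return 255 - value if value >= thr else value


def posterize_pixel(value, bits):
    """Keep only the `bits` most significant bits of an 8-bit value."""
    shift = 8 - bits
    return (value >> shift) << shift


row = [0, 100, 128, 200, 255]
solarized = [solarize_pixel(v) for v in row]
posterized = [posterize_pixel(v, bits=2) for v in row]
```

With `bits=2` only four distinct levels (0, 64, 128, 192) remain.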

mmcv.image.imread(img_or_path, flag='color', channel_order='bgr')[source]¶

Read an image.

Parameters:	img_or_path (ndarray or str or Path) – Either a numpy array or str or pathlib.Path. If it is a numpy array (loaded image), then it will be returned as is. flag (str) – Flags specifying the color type of a loaded image, candidates are color, grayscale and unchanged. Note that the turbojpeg backend does not support unchanged. channel_order (str) – Order of channel, candidates are bgr and rgb.
Returns:	Loaded image array.
Return type:	ndarray

mmcv.image.imwrite(img, file_path, params=None, auto_mkdir=True)[source]¶

Write an image to a file.

Parameters:	img (ndarray) – Image array to be written. file_path (str) – Image file path. params (None or list) – Same as opencv’s `imwrite()` interface. auto_mkdir (bool) – If the parent folder of file_path does not exist, whether to create it automatically.
Returns:	Successful or not.
Return type:	bool

mmcv.image.imfrombytes(content, flag='color', channel_order='bgr')[source]¶

Read an image from bytes.

Parameters:	content (bytes) – Image bytes obtained from files or other streams. flag (str) – Same as `imread()`. channel_order (str) – Same as `imread()`.
Returns:	Loaded image array.
Return type:	ndarray

mmcv.image.bgr2gray(img, keepdim=False)[source]¶

Convert a BGR image to a grayscale image.

Parameters:	img (ndarray) – The input image. keepdim (bool) – If False (by default), then return the grayscale image with 2 dims, otherwise 3 dims.
Returns:	The converted grayscale image.
Return type:	ndarray

mmcv.image.rgb2gray(img, keepdim=False)[source]¶

Convert an RGB image to a grayscale image.

Parameters:	img (ndarray) – The input image. keepdim (bool) – If False (by default), then return the grayscale image with 2 dims, otherwise 3 dims.
Returns:	The converted grayscale image.
Return type:	ndarray

mmcv.image.gray2bgr(img)[source]¶

Convert a grayscale image to a BGR image.

Parameters:	img (ndarray) – The input image.
Returns:	The converted BGR image.
Return type:	ndarray

mmcv.image.gray2rgb(img)[source]¶

Convert a grayscale image to an RGB image.

Parameters:	img (ndarray) – The input image.
Returns:	The converted RGB image.
Return type:	ndarray

mmcv.image.bgr2rgb(img)¶

Convert a BGR image to an RGB image.

Parameters:	img (ndarray or str) – The input image.
Returns:	The converted RGB image.
Return type:	ndarray

mmcv.image.rgb2bgr(img)¶

Convert an RGB image to a BGR image.

Parameters:	img (ndarray or str) – The input image.
Returns:	The converted BGR image.
Return type:	ndarray

mmcv.image.bgr2hsv(img)¶

Convert a BGR image to an HSV image.

Parameters:	img (ndarray or str) – The input image.
Returns:	The converted HSV image.
Return type:	ndarray

mmcv.image.hsv2bgr(img)¶

Convert an HSV image to a BGR image.

Parameters:	img (ndarray or str) – The input image.
Returns:	The converted BGR image.
Return type:	ndarray

mmcv.image.bgr2hls(img)¶

Convert a BGR image to an HLS image.

Parameters:	img (ndarray or str) – The input image.
Returns:	The converted HLS image.
Return type:	ndarray

mmcv.image.hls2bgr(img)¶

Convert an HLS image to a BGR image.

Parameters:	img (ndarray or str) – The input image.
Returns:	The converted BGR image.
Return type:	ndarray

mmcv.image.iminvert(img)[source]¶

Invert (negate) an image.

Parameters:	img (ndarray) – Image to be inverted.
Returns:	The inverted image.
Return type:	ndarray

mmcv.image.imflip(img, direction='horizontal')[source]¶

Flip an image horizontally or vertically.

Parameters:	img (ndarray) – Image to be flipped. direction (str) – The flip direction, either “horizontal” or “vertical”.
Returns:	The flipped image.
Return type:	ndarray

mmcv.image.imflip_(img, direction='horizontal')[source]¶

Inplace flip an image horizontally or vertically.

Parameters:	img (ndarray) – Image to be flipped. direction (str) – The flip direction, either “horizontal” or “vertical”.
Returns:	The flipped image (inplace).
Return type:	ndarray

mmcv.image.imrotate(img, angle, center=None, scale=1.0, border_value=0, auto_bound=False)[source]¶

Rotate an image.

Parameters:	img (ndarray) – Image to be rotated. angle (float) – Rotation angle in degrees, positive values mean clockwise rotation. center (tuple) – Center of the rotation in the source image, by default it is the center of the image. scale (float) – Isotropic scale factor. border_value (int) – Border value. auto_bound (bool) – Whether to adjust the image size to cover the whole rotated image.
Returns:	The rotated image.
Return type:	ndarray

mmcv.image.imcrop(img, bboxes, scale=1.0, pad_fill=None)[source]¶

Crop image patches.

3 steps: scale the bboxes -> clip bboxes -> crop and pad.

Parameters:	img (ndarray) – Image to be cropped. bboxes (ndarray) – Shape (k, 4) or (4, ), location of cropped bboxes. scale (float, optional) – Scale ratio of bboxes, the default value 1.0 means no padding. pad_fill (number or list) – Value to be filled for padding, None for no padding.
Returns:	The cropped image patches.
Return type:	list or ndarray

mmcv.image.impad(img, shape, pad_val=0)[source]¶

Pad an image to a certain shape.

Parameters:	img (ndarray) – Image to be padded. shape (tuple) – Expected padding shape. pad_val (number or sequence) – Values to be filled in padding areas.
Returns:	The padded image.
Return type:	ndarray

mmcv.image.impad_to_multiple(img, divisor, pad_val=0)[source]¶

Pad an image so that each edge is a multiple of some number.

Parameters:	img (ndarray) – Image to be padded. divisor (int) – Padded image edges will be multiples of divisor. pad_val (number or sequence) – Same as `impad()`.
Returns:	The padded image.
Return type:	ndarray
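
The target shape implied by the description above is each edge rounded up to the next multiple of `divisor`. A minimal sketch of that arithmetic follows; `padded_shape` is a hypothetical helper, not part of the library.

```python
def padded_shape(height, width, divisor):
    """Smallest (height, width) >= the input with both edges divisible by divisor."""
    pad_h = (height + divisor - 1) // divisor * divisor  # integer ceil, then scale back
    pad_w = (width + divisor - 1) // divisor * divisor
    return pad_h, pad_w


shape_a = padded_shape(600, 1000, 32)
shape_b = padded_shape(64, 64, 32)  # already a multiple: unchanged
```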

mmcv.image.imnormalize(img, mean, std, to_rgb=True)[source]¶

Normalize an image with mean and std.

Parameters:	img (ndarray) – Image to be normalized. mean (ndarray) – The mean to be used for normalization. std (ndarray) – The std to be used for normalization. to_rgb (bool) – Whether to convert to RGB.
Returns:	The normalized image.
Return type:	ndarray

mmcv.image.imnormalize_(img, mean, std, to_rgb=True)[source]¶

Inplace normalize an image with mean and std.

Parameters:	img (ndarray) – Image to be normalized. mean (ndarray) – The mean to be used for normalization. std (ndarray) – The std to be used for normalization. to_rgb (bool) – Whether to convert to RGB.
Returns:	The normalized image.
Return type:	ndarray

mmcv.image.imresize(img, size, return_scale=False, interpolation='bilinear', out=None)[source]¶

Resize image to a given size.

Parameters:

img (ndarray) – The input image.
size (tuple) – Target (w, h).
return_scale (bool) – Whether to return w_scale and h_scale.
interpolation (str) – Interpolation method, accepted values are “nearest”, “bilinear”, “bicubic”, “area”, “lanczos”.
out (ndarray) – The output destination.

Returns:

(resized_img, w_scale, h_scale) or resized_img.

Return type:

tuple or ndarray

mmcv.image.imresize_like(img, dst_img, return_scale=False, interpolation='bilinear')[source]¶

Resize image to the same size of a given image.

Parameters:

img (ndarray) – The input image.
dst_img (ndarray) – The target image.
return_scale (bool) – Whether to return w_scale and h_scale.
interpolation (str) – Same as imresize().

Returns:

(resized_img, w_scale, h_scale) or resized_img.

Return type:

tuple or ndarray

mmcv.image.imrescale(img, scale, return_scale=False, interpolation='bilinear')[source]¶

Resize image while keeping the aspect ratio.

Parameters:	img (ndarray) – The input image. scale (float or tuple[int]) – The scaling factor or maximum size. If it is a float number, then the image will be rescaled by this factor, else if it is a tuple of 2 integers, then the image will be rescaled as large as possible within the scale. return_scale (bool) – Whether to return the scaling factor besides the rescaled image. interpolation (str) – Same as `imresize()`.
Returns:	The rescaled image.
Return type:	ndarray

mmcv.image.use_backend(backend)[source]¶

Select a backend for image decoding.

Parameters:	backend (str) – The image decoding backend type. Options are cv2 and turbojpeg (see https://github.com/lilohuang/PyTurboJPEG). turbojpeg is faster but only supports the .jpeg file format.

mmcv.image.rescale_size(old_size, scale, return_scale=False)[source]¶

Calculate the new size to be rescaled to.

Parameters:	old_size (tuple[int]) – The old size of image. scale (float or tuple[int]) – The scaling factor or maximum size. If it is a float number, then the image will be rescaled by this factor, else if it is a tuple of 2 integers, then the image will be rescaled as large as possible within the scale. return_scale (bool) – Whether to return the scaling factor besides the rescaled image size.
Returns:	The new rescaled image size.
Return type:	tuple[int]
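
The two interpretations of `scale` can be sketched as below. This is an illustration under assumptions, not the library code: `rescale_size_sketch` is a hypothetical name, the tuple is read as a bound on the long and short edges, and results are rounded to the nearest pixel.

```python
def rescale_size_sketch(old_size, scale):
    """Return the new (w, h) for a scaling factor or a (long, short) size limit."""
    w, h = old_size
    if isinstance(scale, (int, float)):
        if scale <= 0:
            raise ValueError(f"Invalid scale {scale}, must be positive.")
        factor = scale
    else:
        # The tuple bounds the long and the short edge, whatever their order.
        max_long, max_short = max(scale), min(scale)
        factor = min(max_long / max(w, h), max_short / min(w, h))
    # Assumption: round to the nearest integer pixel.
    return int(w * factor + 0.5), int(h * factor + 0.5)


half = rescale_size_sketch((1000, 600), 0.5)
fitted = rescale_size_sketch((400, 200), (1000, 600))
```

In the second call the long edge is the limiting one (factor 2.5), so the aspect ratio is kept and the short edge stays below its bound.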

video¶

class mmcv.video.VideoReader(filename, cache_capacity=10)[source]¶

Video class with similar usage to a list object.

This video wrapper class provides convenient APIs to access frames. OpenCV’s VideoCapture class has an issue where jumping to a certain frame may be inaccurate; this class works around it by checking the position after each jump. A cache is used when decoding videos, so a frame that is visited a second time does not need to be decoded again if it is still in the cache.

Example:

>>> import mmcv
>>> v = mmcv.VideoReader('sample.mp4')
>>> len(v)  # get the total frame number with `len()`
120
>>> for img in v:  # v is iterable
...     mmcv.imshow(img)
>>> v[5]  # get the 6th frame

current_frame()[source]¶

Get the current frame (frame that is just visited).

Returns:	If the video is fresh, return None, otherwise return the frame.
Return type:	ndarray or None

cvt2frames(frame_dir, file_start=0, filename_tmpl='{:06d}.jpg', start=0, max_num=0, show_progress=True)[source]¶

Convert a video to frame images.

Parameters:

frame_dir (str) – Output directory to store all the frame images.
file_start (int) – Filenames will start from the specified number.
filename_tmpl (str) – Filename template with the index as the placeholder.
start (int) – The starting frame index.
max_num (int) – Maximum number of frames to be written.
show_progress (bool) – Whether to show a progress bar.

fourcc¶

“Four character code” of the video.

Type:	str

fps¶

FPS of the video.

Type:	float

frame_cnt¶

Total frames of the video.

Type:	int

get_frame(frame_id)[source]¶

Get frame by index.

Parameters:	frame_id (int) – Index of the expected frame, 0-based.
Returns:	Return the frame if successful, otherwise None.
Return type:	ndarray or None

height¶

Height of video frames.

Type:	int

opened¶

Indicate whether the video is opened.

Type:	bool

position¶

Current cursor position, indicating frame decoded.

Type:	int

read()[source]¶

Read the next frame.

If the next frame has been decoded before and is in the cache, then return it directly, otherwise decode, cache and return it.

Returns:	Return the frame if successful, otherwise None.
Return type:	ndarray or None

resolution¶

Video resolution (width, height).

Type:	tuple

vcap¶

The raw VideoCapture object.

Type:	`cv2.VideoCapture`

width¶

Width of video frames.

Type:	int

mmcv.video.frames2video(frame_dir, video_file, fps=30, fourcc='XVID', filename_tmpl='{:06d}.jpg', start=0, end=0, show_progress=True)[source]¶

Read the frame images from a directory and join them as a video.

Parameters:

frame_dir (str) – The directory containing video frames.
video_file (str) – Output filename.
fps (float) – FPS of the output video.
fourcc (str) – Fourcc of the output video, this should be compatible with the output file type.
filename_tmpl (str) – Filename template with the index as the variable.
start (int) – Starting frame index.
end (int) – Ending frame index.
show_progress (bool) – Whether to show a progress bar.

mmcv.video.convert_video(in_file, out_file, print_cmd=False, pre_options='', **kwargs)[source]¶

Convert a video with ffmpeg.

This provides a general api to ffmpeg, the executed command is:

`ffmpeg -y <pre_options> -i <in_file> <options> <out_file>`

Options(kwargs) are mapped to ffmpeg commands with the following rules:

key=val: “-key val”
key=True: “-key”
key=False: “”

Parameters:	in_file (str) – Input video filename. out_file (str) – Output video filename. pre_options (str) – Options that appear before “-i <in_file>”. print_cmd (bool) – Whether to print the final ffmpeg command.
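
The kwargs-to-options rules above can be sketched as a small command builder. `build_ffmpeg_cmd` is a hypothetical helper that only assembles the string; it does not run ffmpeg and does no shell quoting.

```python
def build_ffmpeg_cmd(in_file, out_file, pre_options="", **kwargs):
    """Assemble the ffmpeg command string from keyword options."""
    options = []
    for key, val in kwargs.items():
        if isinstance(val, bool):
            if val:
                options.append(f"-{key}")  # key=True -> "-key"
            # key=False contributes nothing
        else:
            options.append(f"-{key} {val}")  # key=val -> "-key val"
    parts = ["ffmpeg", "-y", pre_options, "-i", in_file, *options, out_file]
    return " ".join(p for p in parts if p)  # drop empty pieces


cmd = build_ffmpeg_cmd("in.mp4", "out.mp4", vcodec="libx264", an=True, sn=False)
```

Booleans are checked before other values so that `True` is not rendered as `-an True`.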

mmcv.video.resize_video(in_file, out_file, size=None, ratio=None, keep_ar=False, log_level='info', print_cmd=False, **kwargs)[source]¶

Resize a video.

Parameters:

in_file (str) – Input video filename.
out_file (str) – Output video filename.
size (tuple) – Expected size (w, h), e.g. (320, 240) or (320, -1).
ratio (tuple or float) – Expected resize ratio, (2, 0.5) means (w*2, h*0.5).
keep_ar (bool) – Whether to keep original aspect ratio.
log_level (str) – Logging level of ffmpeg.
print_cmd (bool) – Whether to print the final ffmpeg command.

mmcv.video.cut_video(in_file, out_file, start=None, end=None, vcodec=None, acodec=None, log_level='info', print_cmd=False, **kwargs)[source]¶

Cut a clip from a video.

Parameters:

in_file (str) – Input video filename.
out_file (str) – Output video filename.
start (None or float) – Start time (in seconds).
end (None or float) – End time (in seconds).
vcodec (None or str) – Output video codec, None for unchanged.
acodec (None or str) – Output audio codec, None for unchanged.
log_level (str) – Logging level of ffmpeg.
print_cmd (bool) – Whether to print the final ffmpeg command.

mmcv.video.concat_video(video_list, out_file, vcodec=None, acodec=None, log_level='info', print_cmd=False, **kwargs)[source]¶

Concatenate multiple videos into a single one.

Parameters:

video_list (list) – A list of video filenames.
out_file (str) – Output video filename.
vcodec (None or str) – Output video codec, None for unchanged.
acodec (None or str) – Output audio codec, None for unchanged.
log_level (str) – Logging level of ffmpeg.
print_cmd (bool) – Whether to print the final ffmpeg command.

mmcv.video.flowread(flow_or_path, quantize=False, concat_axis=0, *args, **kwargs)[source]¶

Read an optical flow map.

Parameters:	flow_or_path (ndarray or str) – A flow map or filepath. quantize (bool) – Whether to read a quantized pair. If set to True, remaining args will be passed to `dequantize_flow()`. concat_axis (int) – The axis along which dx and dy are concatenated, can be either 0 or 1. Ignored if quantize is False.
Returns:	Optical flow represented as a (h, w, 2) numpy array
Return type:	ndarray

mmcv.video.flowwrite(flow, filename, quantize=False, concat_axis=0, *args, **kwargs)[source]¶

Write optical flow to file.

If the flow is not quantized, it will be saved losslessly as a .flo file, otherwise as a jpeg image, which is lossy but much smaller. (dx and dy will be concatenated horizontally into a single image if quantize is True.)

Parameters:	flow (ndarray) – (h, w, 2) array of optical flow. filename (str) – Output filepath. quantize (bool) – Whether to quantize the flow and save it as a jpeg image. If set to True, remaining args will be passed to `quantize_flow()`. concat_axis (int) – The axis along which dx and dy are concatenated, can be either 0 or 1. Ignored if quantize is False.

mmcv.video.quantize_flow(flow, max_val=0.02, norm=True)[source]¶

Quantize flow to [0, 255].

After this step, the size of flow will be much smaller, and can be dumped as jpeg images.

Parameters:	flow (ndarray) – (h, w, 2) array of optical flow. max_val (float) – Maximum value of flow, values beyond [-max_val, max_val] will be truncated. norm (bool) – Whether to divide flow values by image width/height.
Returns:	Quantized dx and dy.
Return type:	tuple[ndarray]

mmcv.video.dequantize_flow(dx, dy, max_val=0.02, denorm=True)[source]¶

Recover from quantized flow.

Parameters:	dx (ndarray) – Quantized dx. dy (ndarray) – Quantized dy. max_val (float) – Maximum value used when quantizing. denorm (bool) – Whether to multiply flow values with width/height.
Returns:	Dequantized flow.
Return type:	ndarray

mmcv.video.flow_warp(img, flow, filling_value=0, interpolate_mode='nearest')[source]¶

Use flow to warp img.

Parameters:	img (ndarray, float or uint8) – Image to be warped. flow (ndarray, float) – Optical Flow. filling_value (int) – The missing pixels will be set with filling_value. interpolate_mode (str) – bilinear -> Bilinear Interpolation; nearest -> Nearest Neighbor.
Returns:	Warped image with the same shape as img.
Return type:	ndarray

arraymisc¶

mmcv.arraymisc.quantize(arr, min_val, max_val, levels, dtype=np.int64)[source]¶

Quantize an array of (-inf, inf) to [0, levels-1].

Parameters:	arr (ndarray) – Input array. min_val (scalar) – Minimum value to be clipped. max_val (scalar) – Maximum value to be clipped. levels (int) – Quantization levels. dtype (np.type) – The type of the quantized array.
Returns:	Quantized array.
Return type:	ndarray

mmcv.arraymisc.dequantize(arr, min_val, max_val, levels, dtype=np.float64)[source]¶

Dequantize an array.

Parameters:	arr (ndarray) – Input array. min_val (scalar) – Minimum value to be clipped. max_val (scalar) – Maximum value to be clipped. levels (int) – Quantization levels. dtype (np.type) – The type of the dequantized array.
Returns:	Dequantized array.
Return type:	ndarray
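
A pure-Python sketch of one uniform quantization scheme that matches the description above: clip to [min_val, max_val], map to bins 0 .. levels-1, and dequantize to bin centres. The function names are hypothetical and the bin-centre reconstruction is an assumption; the library operates on arrays.

```python
import math


def quantize_sketch(values, min_val, max_val, levels):
    """Map floats to integer bins 0 .. levels-1 after clipping to [min_val, max_val]."""
    bins = []
    for v in values:
        clipped = min(max(v, min_val), max_val) - min_val
        idx = math.floor(levels * clipped / (max_val - min_val))
        bins.append(min(idx, levels - 1))  # max_val falls into the last bin
    return bins


def dequantize_sketch(bins, min_val, max_val, levels):
    """Map each bin index back to the centre of its interval."""
    step = (max_val - min_val) / levels
    return [(b + 0.5) * step + min_val for b in bins]


q = quantize_sketch([-5.0, -1.0, 0.0, 0.6, 1.0], -1.0, 1.0, 4)
d = dequantize_sketch(q, -1.0, 1.0, 4)
```

The round trip is lossy: every value in a bin comes back as that bin's centre, so the error is at most half a step for values inside the clipping range.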

visualization¶

class mmcv.visualization.Color[source]¶

An enum that defines common colors.

Contains red, green, blue, cyan, yellow, magenta, white and black.

mmcv.visualization.color_val(color)[source]¶

Convert various input to color tuples.

Parameters:	color (`Color`/str/tuple/int/ndarray) – Color inputs
Returns:	A tuple of 3 integers indicating BGR channels.
Return type:	tuple[int]

mmcv.visualization.imshow(img, win_name='', wait_time=0)[source]¶

Show an image.

Parameters:	img (str or ndarray) – The image to be displayed. win_name (str) – The window name. wait_time (int) – Value of waitKey param.

mmcv.visualization.imshow_bboxes(img, bboxes, colors='green', top_k=-1, thickness=1, show=True, win_name='', wait_time=0, out_file=None)[source]¶

Draw bboxes on an image.

Parameters:

img (str or ndarray) – The image to be displayed.
bboxes (list or ndarray) – A list of ndarray of shape (k, 4).
colors (list[str or tuple or Color]) – A list of colors.
top_k (int) – Plot the first k bboxes only if set positive.
thickness (int) – Thickness of lines.
show (bool) – Whether to show the image.
win_name (str) – The window name.
wait_time (int) – Value of waitKey param.
out_file (str, optional) – The filename to write the image.

mmcv.visualization.imshow_det_bboxes(img, bboxes, labels, class_names=None, score_thr=0, bbox_color='green', text_color='green', thickness=1, font_scale=0.5, show=True, win_name='', wait_time=0, out_file=None)[source]¶

Draw bboxes and class labels (with scores) on an image.

Parameters:

img (str or ndarray) – The image to be displayed.
bboxes (ndarray) – Bounding boxes (with scores), shaped (n, 4) or (n, 5).
labels (ndarray) – Labels of bboxes.
class_names (list[str]) – Names of each class.
score_thr (float) – Minimum score of bboxes to be shown.
bbox_color (str or tuple or Color) – Color of bbox lines.
text_color (str or tuple or Color) – Color of texts.
thickness (int) – Thickness of lines.
font_scale (float) – Font scales of texts.
show (bool) – Whether to show the image.
win_name (str) – The window name.
wait_time (int) – Value of waitKey param.
out_file (str or None) – The filename to write the image.

mmcv.visualization.flowshow(flow, win_name='', wait_time=0)[source]¶

Show optical flow.

Parameters:	flow (ndarray or str) – The optical flow to be displayed. win_name (str) – The window name. wait_time (int) – Value of waitKey param.

mmcv.visualization.flow2rgb(flow, color_wheel=None, unknown_thr=1000000.0)[source]¶

Convert flow map to RGB image.

Parameters:	flow (ndarray) – Array of optical flow. color_wheel (ndarray or None) – Color wheel used to map flow field to RGB colorspace. Default color wheel will be used if not specified. unknown_thr (float) – Values above this threshold will be marked as unknown and thus ignored.
Returns:	RGB image that can be visualized.
Return type:	ndarray

mmcv.visualization.make_color_wheel(bins=None)[source]¶

Build a color wheel.

Parameters:	bins (list or tuple, optional) – Specify the number of bins for each color range, corresponding to six ranges: red -> yellow, yellow -> green, green -> cyan, cyan -> blue, blue -> magenta, magenta -> red. [15, 6, 4, 11, 13, 6] is used by default (see Middlebury).
Returns:	Color wheel of shape (total_bins, 3).
Return type:	ndarray

utils¶

class mmcv.utils.Config(cfg_dict=None, cfg_text=None, filename=None)[source]¶

A facility for config and config files.

It supports common file formats as configs: python/json/yaml. The interface is the same as a dict object and also allows accessing config values as attributes.

Example

>>> cfg = Config(dict(a=1, b=dict(b1=[0, 1])))
>>> cfg.a
1
>>> cfg.b
{'b1': [0, 1]}
>>> cfg.b.b1
[0, 1]
>>> cfg = Config.fromfile('tests/data/config/a.py')
>>> cfg.filename
"/home/kchen/projects/mmcv/tests/data/config/a.py"
>>> cfg.item4
'test'
>>> cfg
"Config [path: /home/kchen/projects/mmcv/tests/data/config/a.py]: "
"{'item1': [1, 2], 'item2': {'a': 0}, 'item3': True, 'item4': 'test'}"

static auto_argparser(description=None)[source]¶

Generate argparser from config file automatically (experimental).

merge_from_dict(options)[source]¶

Merge a dict into cfg_dict.

Merge the dict parsed by MultipleKVAction into this cfg. Example,

>>> options = {'model.backbone.depth': 50}
>>> cfg = Config(dict(model=dict(backbone=dict(type='ResNet'))))
>>> cfg.merge_from_dict(options)

Parameters:	options (dict) – dict of configs to merge from.
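
The effect of a dotted key such as 'model.backbone.depth' can be sketched on a plain nested dict. `merge_dotted_options` is a hypothetical helper for illustration; it is not how Config is implemented.

```python
def merge_dotted_options(cfg, options):
    """Merge {'a.b.c': value} style options into a nested dict in place."""
    for full_key, value in options.items():
        node = cfg
        *parents, leaf = full_key.split(".")
        for key in parents:
            # Create intermediate dicts that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return cfg


cfg = {"model": {"backbone": {"type": "ResNet"}}}
merge_dotted_options(cfg, {"model.backbone.depth": 50})
```

Existing sibling keys (here `type`) are kept; only the addressed leaf is added or overwritten.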

class mmcv.utils.ConfigDict(*args, **kwargs)[source]¶

mmcv.utils.get_logger(name, log_file=None, log_level=20)[source]¶

Initialize and get a logger by name.

If the logger has not been initialized, this method will initialize the logger by adding one or two handlers, otherwise the initialized logger will be directly returned. During initialization, a StreamHandler will always be added. If log_file is specified and the process rank is 0, a FileHandler will also be added.

Parameters:	name (str) – Logger name. log_file (str | None) – The log filename. If specified, a FileHandler will be added to the logger. log_level (int) – The logger level. Note that only the process of rank 0 is affected; other processes will set the level to “Error” and thus be silent most of the time.
Returns:	The expected logger.
Return type:	logging.Logger

mmcv.utils.print_log(msg, logger=None, level=20)[source]¶

Print a log message.

Parameters:

msg (str) – The message to be logged.
logger (logging.Logger | str | None) – The logger to be used. Some special loggers are:
    - “silent”: no message will be printed.
    - other str: the logger obtained with get_root_logger(logger).
    - None: The print() method will be used to print log messages.
level (int) – Logging level. Only available when logger is a Logger object or “root”.

mmcv.utils.is_str(x)[source]¶

Whether the input is a string instance.

Note: This method is deprecated since python 2 is no longer supported.

mmcv.utils.iter_cast(inputs, dst_type, return_type=None)[source]¶

Cast elements of an iterable object into some type.

Parameters:	inputs (Iterable) – The input object. dst_type (type) – Destination type. return_type (type, optional) – If specified, the output object will be converted to this type, otherwise an iterator.
Returns:	The converted object.
Return type:	iterator or specified type

mmcv.utils.list_cast(inputs, dst_type)[source]¶

Cast elements of an iterable object into a list of some type.

A partial method of iter_cast().

mmcv.utils.tuple_cast(inputs, dst_type)[source]¶

Cast elements of an iterable object into a tuple of some type.

A partial method of iter_cast().
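
The relationship between the three casting helpers can be sketched with `functools.partial`. The `*_sketch` names are hypothetical stand-ins written for illustration, not the library functions.

```python
from functools import partial


def iter_cast_sketch(inputs, dst_type, return_type=None):
    """Cast every element to dst_type; wrap in return_type if one is given."""
    out_iterable = map(dst_type, inputs)
    return out_iterable if return_type is None else return_type(out_iterable)


# The list and tuple variants only fix the return_type argument.
list_cast_sketch = partial(iter_cast_sketch, return_type=list)
tuple_cast_sketch = partial(iter_cast_sketch, return_type=tuple)

as_list = list_cast_sketch(["1", "2", "3"], int)
as_tuple = tuple_cast_sketch([1, 2], str)
lazy = iter_cast_sketch([1.5, 2.5], int)  # an iterator, consumed on demand
```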

mmcv.utils.is_seq_of(seq, expected_type, seq_type=None)[source]¶

Check whether it is a sequence of some type.

Parameters:	seq (Sequence) – The sequence to be checked. expected_type (type) – Expected type of sequence items. seq_type (type, optional) – Expected sequence type.
Returns:	Whether the sequence is valid.
Return type:	bool

mmcv.utils.is_list_of(seq, expected_type)[source]¶

Check whether it is a list of some type.

A partial method of is_seq_of().

mmcv.utils.is_tuple_of(seq, expected_type)[source]¶

Check whether it is a tuple of some type.

A partial method of is_seq_of().
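
A minimal sketch of the check performed by this family of helpers, assuming that a missing `seq_type` means "any sequence". `is_seq_of_sketch` is a hypothetical name.

```python
from collections.abc import Sequence


def is_seq_of_sketch(seq, expected_type, seq_type=None):
    """True if seq is a sequence (of seq_type, if given) holding only expected_type."""
    container_type = Sequence if seq_type is None else seq_type
    if not isinstance(seq, container_type):
        return False
    return all(isinstance(item, expected_type) for item in seq)


ok_any = is_seq_of_sketch((1, 2, 3), int)
wrong_container = is_seq_of_sketch((1, 2, 3), int, seq_type=list)
wrong_item = is_seq_of_sketch([1, "a"], int)
```

The list and tuple variants correspond to fixing `seq_type` to `list` or `tuple`.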

mmcv.utils.slice_list(in_list, lens)[source]¶

Slice a list into several sub-lists according to a list of given lengths.

Parameters:	in_list (list) – The list to be sliced. lens (int or list) – The expected length of each out list.
Returns:	A list of sliced lists.
Return type:	list

mmcv.utils.concat_list(in_list)[source]¶

Concatenate a list of lists into a single list.

Parameters:	in_list (list) – The list of lists to be merged.
Returns:	The concatenated flat list.
Return type:	list
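
Slicing and concatenation are inverse operations, which the following sketch illustrates. The `*_sketch` names are hypothetical; treating an int `lens` as "equal chunks that divide the list evenly" and raising ValueError on a length mismatch are assumptions.

```python
from itertools import chain


def slice_list_sketch(in_list, lens):
    """Split in_list into consecutive chunks whose sizes are given by lens."""
    if isinstance(lens, int):
        # Assumption: an int means equally sized chunks that divide the list evenly.
        if lens <= 0 or len(in_list) % lens != 0:
            raise ValueError("lens must be positive and divide len(in_list)")
        lens = [lens] * (len(in_list) // lens)
    if sum(lens) != len(in_list):
        raise ValueError("sum of lens and list length do not match")
    out, start = [], 0
    for n in lens:
        out.append(in_list[start:start + n])
        start += n
    return out


def concat_list_sketch(in_list):
    """Flatten one level of nesting."""
    return list(chain.from_iterable(in_list))


chunks = slice_list_sketch([1, 2, 3, 4, 5, 6], [1, 2, 3])
pairs = slice_list_sketch([1, 2, 3, 4], 2)
flat = concat_list_sketch(chunks)
```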

mmcv.utils.check_prerequisites(prerequisites, checker, msg_tmpl='Prerequisites "{}" are required in method "{}" but not found, please install them first.')[source]¶

A decorator factory to check if prerequisites are satisfied.

Parameters:	prerequisites (str or list[str]) – Prerequisites to be checked. checker (callable) – The checker method that returns True if a prerequisite is met, False otherwise. msg_tmpl (str) – The message template with two variables.
Returns:	A specific decorator.
Return type:	decorator
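
A decorator factory of this shape might look like the sketch below. It is an illustration only: the `*_sketch` and helper names are hypothetical, the check is done at call time, and raising RuntimeError is an assumption about the failure mode.

```python
import functools
import importlib.util


def check_prerequisites_sketch(
        prerequisites, checker,
        msg_tmpl='Prerequisites "{}" are required in method "{}" but not found.'):
    """Return a decorator that verifies prerequisites before each call."""
    def wrap(func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            names = [prerequisites] if isinstance(prerequisites, str) else prerequisites
            missing = [name for name in names if not checker(name)]
            if missing:
                # Assumption: a RuntimeError signals unmet prerequisites.
                raise RuntimeError(msg_tmpl.format(", ".join(missing), func.__name__))
            return func(*args, **kwargs)
        return wrapped
    return wrap


def package_installed(name):
    """Checker: True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(name) is not None


@check_prerequisites_sketch("json", package_installed)
def dumps_one():
    import json
    return json.dumps(1)


@check_prerequisites_sketch(["json", "surely_not_a_real_package_xyz"], package_installed)
def never_runs():
    return None


result = dumps_one()
try:
    never_runs()
except RuntimeError as err:
    error_message = str(err)
```

Specialised decorators for packages or executables then only need to supply a different `checker`.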

mmcv.utils.requires_package(prerequisites)[source]¶

A decorator to check if some python packages are installed.

Example

>>> @requires_package('numpy')
... def func(arg1, args):
...     return numpy.zeros(1)
array([0.])
>>> @requires_package(['numpy', 'non_package'])
... def func(arg1, args):
...     return numpy.zeros(1)
ImportError

mmcv.utils.requires_executable(prerequisites)[source]¶

A decorator to check if some executable files are installed.

Example

>>> @requires_executable('ffmpeg')
... def func(arg1, args):
...     print(1)
1

mmcv.utils.scandir(dir_path, suffix=None, recursive=False)[source]¶

Scan a directory to find the files of interest.

Parameters:	dir_path (str | Path) – Path of the directory. suffix (str | tuple(str), optional) – File suffix that we are interested in. Default: None. recursive (bool, optional) – If set to True, recursively scan the directory. Default: False.
Returns:	A generator for all the files of interest, with relative paths.
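
A generator with this behaviour can be sketched with `os.scandir` from the standard library. `scandir_sketch` is a hypothetical name; details such as the handling of hidden files are not covered here.

```python
import os
import tempfile


def scandir_sketch(dir_path, suffix=None, recursive=False):
    """Yield paths of matching files, relative to dir_path."""
    root = os.fspath(dir_path)

    def _scan(current):
        for entry in os.scandir(current):
            if entry.is_file():
                rel_path = os.path.relpath(entry.path, root)
                # str.endswith accepts a single suffix or a tuple of suffixes.
                if suffix is None or rel_path.endswith(suffix):
                    yield rel_path
            elif recursive and entry.is_dir():
                yield from _scan(entry.path)

    return _scan(root)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.jpg", "b.txt", os.path.join("sub", "c.jpg")):
        with open(os.path.join(tmp, rel), "w"):
            pass  # create an empty file
    flat = sorted(scandir_sketch(tmp, suffix=".jpg"))
    deep = sorted(scandir_sketch(tmp, suffix=(".jpg", ".png"), recursive=True))
```

The generator is consumed inside the `with` block because the temporary directory disappears afterwards.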

class mmcv.utils.ProgressBar(task_num=0, bar_width=50, start=True, file=sys.stdout)[source]¶

A progress bar which can print the progress.

mmcv.utils.track_progress(func, tasks, bar_width=50, file=sys.stdout, **kwargs)[source]¶

Track the progress of task execution with a progress bar.

Tasks are done with a simple for-loop.

Parameters:	func (callable) – The function to be applied to each task. tasks (list or tuple[Iterable, int]) – A list of tasks or (tasks, total num). bar_width (int) – Width of progress bar.
Returns:	The task results.
Return type:	list

mmcv.utils.track_iter_progress(tasks, bar_width=50, file=sys.stdout, **kwargs)[source]¶

Track the progress of tasks iteration or enumeration with a progress bar.

Tasks are yielded with a simple for-loop.

Parameters:	tasks (list or tuple[Iterable, int]) – A list of tasks or (tasks, total num). bar_width (int) – Width of progress bar.
Yields:	list – The task results.

mmcv.utils.track_parallel_progress(func, tasks, nproc, initializer=None, initargs=None, bar_width=50, chunksize=1, skip_first=False, keep_order=True, file=sys.stdout)[source]¶

Track the progress of parallel task execution with a progress bar.

The built-in multiprocessing module is used for process pools and tasks are done with Pool.imap() or Pool.imap_unordered().

Parameters:	func (callable) – The function to be applied to each task. tasks (list or tuple[Iterable, int]) – A list of tasks or (tasks, total num). nproc (int) – Process (worker) number. initializer (None or callable) – Refer to `multiprocessing.Pool` for details. initargs (None or tuple) – Refer to `multiprocessing.Pool` for details. chunksize (int) – Refer to `multiprocessing.Pool` for details. bar_width (int) – Width of progress bar. skip_first (bool) – Whether to skip the first sample for each worker when estimating fps, since the initialization step may take longer. keep_order (bool) – If True, `Pool.imap()` is used, otherwise `Pool.imap_unordered()` is used.
Returns:	The task results.
Return type:	list

class mmcv.utils.Registry(name)[source]¶

A registry to map strings to classes.

Parameters:	name (str) – Registry name.

get(key)[source]¶

Get the registry record.

Parameters:	key (str) – The class name in string format.
Returns:	The corresponding class.
Return type:	class

register_module(cls=None, force=False)[source]¶

Register a module.

A record will be added to self._module_dict, whose key is the class name and value is the class itself. It can be used as a decorator or a normal function.

Example

>>> backbones = Registry('backbone')
>>> @backbones.register_module
... class ResNet(object):
...     pass

Example

>>> backbones = Registry('backbone')
>>> class ResNet(object):
...     pass
>>> backbones.register_module(ResNet)

Parameters:	module (`nn.Module`) – Module to be registered. force (bool, optional) – Whether to override an existing class with the same name. Default: False.

mmcv.utils.build_from_cfg(cfg, registry, default_args=None)[source]¶

Build a module from config dict.

Parameters:	cfg (dict) – Config dict. It should at least contain the key “type”. registry (`Registry`) – The registry to search the type from. default_args (dict, optional) – Default initialization arguments.
Returns:	The constructed object.
Return type:	obj
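
How a registry and a config dict fit together can be sketched in a few lines. This is a simplified stand-in, not the library code: `RegistrySketch` and `build_from_cfg_sketch` are hypothetical names, and letting explicit config values take precedence over `default_args` is an assumption.

```python
class RegistrySketch:
    """Map class names (strings) to classes."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def get(self, key):
        return self._module_dict.get(key)

    def register_module(self, cls=None, force=False):
        def _register(target):
            name = target.__name__
            if not force and name in self._module_dict:
                raise KeyError(f"{name} is already registered in {self.name}")
            self._module_dict[name] = target
            return target
        # Used as a bare decorator or plain call: cls is the class itself.
        return _register if cls is None else _register(cls)


def build_from_cfg_sketch(cfg, registry, default_args=None):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)  # copy so the caller's dict is not modified
    obj_type = args.pop("type")
    obj_cls = registry.get(obj_type)
    if obj_cls is None:
        raise KeyError(f"{obj_type} is not in the {registry.name} registry")
    for key, value in (default_args or {}).items():
        args.setdefault(key, value)  # assumption: explicit cfg values win
    return obj_cls(**args)


backbones = RegistrySketch("backbone")


@backbones.register_module
class ResNet:
    def __init__(self, depth, stages=4):
        self.depth = depth
        self.stages = stages


model = build_from_cfg_sketch(
    dict(type="ResNet", depth=50), backbones, default_args=dict(stages=3))
```

Keeping construction behind a string lookup is what lets a config file select classes without importing them directly.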

class mmcv.utils.Timer(start=True, print_tmpl=None)[source]¶

A flexible Timer class.

Example:

>>> import time
>>> import mmcv
>>> with mmcv.Timer():
...     # simulate a code block that will run for 1s
...     time.sleep(1)
1.000
>>> with mmcv.Timer(print_tmpl='it takes {:.1f} seconds'):
...     # simulate a code block that will run for 1s
...     time.sleep(1)
it takes 1.0 seconds
>>> timer = mmcv.Timer()
>>> time.sleep(0.5)
>>> print(timer.since_start())
0.500
>>> time.sleep(0.5)
>>> print(timer.since_last_check())
0.500
>>> print(timer.since_start())
1.000

is_running¶

Indicates whether the timer is running.

Type:	bool

since_last_check()[source]¶

Time since the last checking.

Either since_start() or since_last_check() is a checking operation.

Returns:	Time in seconds.
Return type:	float

since_start()[source]¶

Total time since the timer was started.

Returns:	Time in seconds.
Return type:	float

start()[source]¶

Start the timer.
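
The difference between `since_start()` and `since_last_check()` can be sketched with a small stand-in class. `TimerSketch` is hypothetical; it uses `time.perf_counter` and omits the context-manager and printing features of the class documented above.

```python
import time


class TimerSketch:
    """Track total elapsed time and time since the previous check."""

    def __init__(self, start=True):
        self._is_running = False
        if start:
            self.start()

    @property
    def is_running(self):
        return self._is_running

    def start(self):
        if not self._is_running:
            self._t_start = time.perf_counter()
            self._is_running = True
        self._t_last = time.perf_counter()

    def since_start(self):
        if not self._is_running:
            raise RuntimeError("timer is not running")
        # Calling since_start() also counts as a check.
        self._t_last = time.perf_counter()
        return self._t_last - self._t_start

    def since_last_check(self):
        if not self._is_running:
            raise RuntimeError("timer is not running")
        now = time.perf_counter()
        elapsed = now - self._t_last
        self._t_last = now
        return elapsed


timer = TimerSketch()
time.sleep(0.05)
total = timer.since_start()      # roughly the sleep duration
lap = timer.since_last_check()   # only the time since the call above
```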

exception mmcv.utils.TimerError(message)[source]¶

mmcv.utils.check_time(timer_id)[source]¶

Add check points in a single line.

This method is suitable for running a task on a list of items. A timer will be registered when the method is called for the first time.

Example:

>>> import time
>>> import mmcv
>>> for i in range(1, 6):
...     # simulate a code block
...     time.sleep(i)
...     mmcv.check_time('task1')
2.000
3.000
4.000
5.000

Parameters:	timer_id (str) – Timer identifier.
