Premises:
The name table and pattern table are fixed. All patterns are 00001111. Only the color table will change.
There are 59736 cycles per frame on NTSC.
One OUTI takes 18 cycles (16 on the Z80 plus 2 for the MSX M1 wait states), so I can push around 3300 bytes per frame.
3 frames = 9900 bytes
Entire color table = 32 x 192 = 6144 bytes
Sprite pattern table for 32 16x16 sprites = 32 x 32 = 1024 bytes
Sprite color table for the 32 sprites = 32 x 16 = 512 bytes
Sprite attributes table for 32 sprites = 32 x 4 = 128 bytes
Total = 7808 bytes, fitting well inside 3 frames
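A quick way to double-check the budget above is a few lines of Python. This is only a sketch of the arithmetic: the numbers come from the premises, the variable names are made up here, and it ignores any loop or bank-switching overhead, so the real throughput will be somewhat lower.

```python
# Figures from the premises; names are illustrative only.
CYCLES_PER_FRAME = 59736   # NTSC
CYCLES_PER_OUTI = 18       # OUTI including the MSX M1 wait states

# Upper bound: assumes back-to-back OUTIs with no other instructions.
bytes_per_frame = CYCLES_PER_FRAME // CYCLES_PER_OUTI
budget_3_frames = 3 * bytes_per_frame

color_table = 32 * 192       # 32 bytes per line, 192 lines
sprite_patterns = 32 * 32    # 32 sprites of 16x16, 32 bytes each
sprite_colors = 32 * 16      # 16 color bytes per sprite
sprite_attributes = 32 * 4   # 4 bytes per sprite

payload = color_table + sprite_patterns + sprite_colors + sprite_attributes
slack = budget_3_frames - payload

print(bytes_per_frame, budget_3_frames, payload, slack)
# prints: 3318 9954 7808 2146
```

So the exact ceiling is a little above the rounded 3300 / 9900 figures, leaving roughly 2100 bytes' worth of cycles spare for everything that is not an OUTI.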
The idea is to use the 32 16x16 sprites (magnified, so each covers a 32x32 area) to complete the image where it is most needed.
This way I will have a base frame with a resolution of 64x192, plus a sprite layer covering 67% of the screen to improve the resolution, especially horizontally. The sprites can cover an area equivalent to 256x128 pixels, of course with big (2x2) pixels.
They will be positioned on each frame where it makes the most sense to compose a better picture. That's why it's necessary to blit the sprite attribute table as well.
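The resolution and coverage figures can be checked the same way. Again just a sketch: the constants restate what is described above, and the names are invented for the example.

```python
SCREEN_W, SCREEN_H = 256, 192
SPRITES = 32
SPRITE_SIDE = 16 * 2   # 16x16 sprite, magnified 2x -> 32x32 on screen

# Pattern 00001111 splits each 8-pixel cell into two 4-pixel halves,
# so the base layer has SCREEN_W / 4 independent columns.
base_columns = SCREEN_W // 4

sprite_area = SPRITES * SPRITE_SIDE * SPRITE_SIDE
coverage = sprite_area / (SCREEN_W * SCREEN_H)

print(base_columns, sprite_area, 256 * 128, round(coverage * 100))
# prints: 64 32768 32768 67
```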
All this data will be precalculated on a PC and put in a big MegaROM. The MSX assembly code will just read the bytes and spit them out to VRAM. Double buffering with 2 pages of VRAM (there are 8 on screen 4). Pretty simple code on the MSX side.
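On the PC side, the converter could end by packing each update into one fixed-size record. The sketch below is hypothetical: the function names and the order of the four sections are assumptions, not something decided above. The only hardware detail it relies on is that, in the color table, the high nibble colors the 1 bits of the pattern and the low nibble colors the 0 bits, so with pattern 00001111 the left half of a cell takes the low nibble and the right half the high nibble.

```python
def pack_color_byte(left, right):
    """Color byte for one 8-pixel row drawn with pattern 00001111.

    left  -> color of the 0 bits (left 4 pixels), stored in the low nibble
    right -> color of the 1 bits (right 4 pixels), stored in the high nibble
    """
    return ((right & 0x0F) << 4) | (left & 0x0F)


def pack_frame(color_table, sprite_patterns, sprite_colors, sprite_attrs):
    """Concatenate the four tables into one 7808-byte record.

    The section order is an assumption; the player only has to agree with it.
    """
    sections = [
        ("color table", color_table, 6144),
        ("sprite patterns", sprite_patterns, 1024),
        ("sprite colors", sprite_colors, 512),
        ("sprite attributes", sprite_attrs, 128),
    ]
    for name, data, size in sections:
        if len(data) != size:
            raise ValueError(f"{name}: expected {size} bytes, got {len(data)}")
    return b"".join(bytes(data) for _, data, _ in sections)


# Example: an all-zero frame, just to show the record size.
frame = pack_frame(bytes(6144), bytes(1024), bytes(512), bytes(128))
print(len(frame), hex(pack_color_byte(1, 15)))
# prints: 7808 0xf1
```

A fixed record size keeps the MSX side trivial, since the player never has to parse anything.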
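At one update every 3 frames there are 60 / 3 = 20 updates per second, so each second of video will use 20 x 7808 = 156160 bytes (about 156 KB). A full 256-page MegaROM (4 MB) will store about 25 seconds of video.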
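The storage estimate depends on how the records are laid out in the ROM. The sketch below assumes 16 KB pages (256 of them make 4 MB) and a nominal 60 Hz; both the page size and the "whole records per page" layout are assumptions made for the example, and it ignores the space taken by the player code itself.

```python
FRAME_BYTES = 7808
UPDATES_PER_SECOND = 60 // 3   # one update every 3 frames, nominal 60 Hz
PAGE_BYTES = 16 * 1024         # assumption: 16 KB MegaROM pages
PAGES = 256

bytes_per_second = FRAME_BYTES * UPDATES_PER_SECOND
rom_bytes = PAGES * PAGE_BYTES

# Records packed back to back, straddling page boundaries.
seconds_raw = rom_bytes / bytes_per_second

# Only whole records per page, so the player never switches page mid-record.
frames_per_page = PAGE_BYTES // FRAME_BYTES
seconds_paged = PAGES * frames_per_page / UPDATES_PER_SECOND

print(bytes_per_second, round(seconds_raw, 1), frames_per_page, seconds_paged)
# prints: 156160 26.9 2 25.6
```

Keeping two whole records per page wastes 768 bytes in each page but matches the 25-second figure; packing them back to back would buy a little over one extra second.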
Sounds like a cool idea?