Brain Bit Precision: Int32 FP32, Int16 FP16, Int8 FP8, Int6 FP6, Int4? The ideal Computational Machine Learning (ML) TOPS precision for the human brain:
Brain-level Int/Float inferencing is ideally Int8/Int7 with error bits or float remainders
Comparison List : RS
48Bit Int+Float Int48+FP48 (many connections, Eyes for example) HDR Vision
40Bit Int+Float Int40+FP40 HDR Basic
Int16 FP32
Int8 Float16 (2 Channel, Brain Node) (3% Brain Study)
Int7 (20% Brain Study)
Int6 (80% Brain Study)
Int5 (Wolves (some are 6+))
Int4 (Sheep & worms)
Int3 (Germ biosystems)
Statistically, a science test stated that 80% of human brains quantify at 6Bit & 20% at 7Bit.
Xbox Series X & PlayStation 5 go down to Int4 (quite likely for quick inferencing).
Be aware that using 4Bit Int instructions potentially means more instructions used per clock cycle & more micro data transfers..
Int8 is most commonly able to quantify data with minimum error in 8Bit, like the Atari STE or the Nintendo 8Bit..
Colour perception for example is many orders of magnitude higher! Otherwise 8Bit EGA colours would be all we use..
16Bit was not good enough.. But 32Bit suits most people! And 10Bit(x4) 40Bit & Dolby 12Bit(x4) 48Bit is a luxury & we love it!
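To make the bit-depth comparison concrete, here is a minimal sketch (my illustration, not from the quoted brain study) of how mean quantization error shrinks as integer precision rises; it assumes plain uniform quantization of a random signal with NumPy.

import numpy as np

# Mean quantization error of a uniform [0, 1) signal at different bit depths.
rng = np.random.default_rng(0)
signal = rng.random(100_000)            # stand-in for a perceptual signal

for bits in (4, 6, 7, 8, 10, 12):
    levels = 2 ** bits
    quantized = np.round(signal * (levels - 1)) / (levels - 1)
    err = np.abs(signal - quantized).mean()
    print(f"Int{bits}: mean error ~ {err:.6f}")

Each extra bit roughly halves the error, which is the sense in which 8Bit keeps the error small enough for most perceptual data while 4Bit does not.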
Precision Quality Control in ML:
While nothing is certain, human beings appear to have Integer precision of around 8 & are more surely able to practice Float units,
Bundling is when multiple neuron roots go to the same neuron in sync from the same response cluster of neurons,
This feature enhances data integrity & precision by multiplying data transfer & response precision..
Memory & Maths calculations.
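One hedged way to model bundling (my own reading of the passage above, not a stated mechanism): several synchronised low-precision connections carrying the same value, averaged at the receiving node, behave like one higher-precision connection.

import numpy as np

# Hedged model of bundling: N synchronised low-precision channels carry the
# same value; independent rounding noise averages out at the receiving node.
rng = np.random.default_rng(1)
true_value = 0.637
bits = 6                                 # per-connection precision
step = 1.0 / (2 ** bits)                 # one quantization level

for n_connections in (1, 2, 4, 8):
    dither = rng.uniform(-step / 2, step / 2, n_connections)
    reads = np.round((true_value + dither) / step) * step
    bundled = reads.mean()
    print(f"{n_connections} connection(s): error {abs(bundled - true_value):.5f}")

The bundled estimate tends to sit closer to the true value as connections are added, which matches the claim that bundling multiplies response precision.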
(c)Rupert S https://is.gd/ProcessorLasso
ML Classification Bundling for HIM & Her
Sorting bundles by priorities such as time to process, similarity & probability (likelihood) improves perception & the thought process (see the sort sketch below),
Logical sort orders..
Required processing order based on sorted requirements (one needs another)
Items that go locally together, { Cleaning, Cooking, cleanup }
Logical order, { Drink, Power, Computer, Application, Search, Webpage, Notebook, read, write }
Saving data caches it & aids processing; but organising it first makes retrieval clean & thought clean: Meditation Logic.
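A minimal sketch of the bundle sort described above; the field names (time_to_process, similarity, probability) & the example tasks are my own illustrative choices, not part of the original text.

# Sort bundles by priority: fastest to process first, then most similar,
# then most likely. Field names & values are illustrative only.
tasks = [
    {"name": "Webpage",  "time_to_process": 3, "similarity": 0.4, "probability": 0.7},
    {"name": "Search",   "time_to_process": 1, "similarity": 0.9, "probability": 0.8},
    {"name": "Notebook", "time_to_process": 2, "similarity": 0.9, "probability": 0.5},
]

tasks.sort(key=lambda t: (t["time_to_process"], -t["similarity"], -t["probability"]))
print([t["name"] for t in tasks])    # processing order: Search, Notebook, Webpage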
Human Brain cells have 1000 connections, squid 10000; Each connection does:
7Bit regular
8Bit, sharp
9Bit on better effort,
10Bit on clarity & meditation + work hard
6Bit on relaxed,
5Bit on drunk
Connections for dedicated skills such as maths have:
Dedication bundling (multiple connections)
Multiple Affirmations, A-Synchronous, Synchronous
1, 5Bit to 7Bit
2, 5Bit to 18Bit
3, 7Bit to 26Bit
4, 16Bit to 38Bit
5, 17Bit to 48Bit
Eyes for example can bundle 5 on training, colour purity..
Lower bundling offers more flexibility,
High bundling offers surety & speed & retention.
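One hedged reading of the bundling table above is that each added connection contributes roughly 5 to 10 bits at best; the sketch below simply encodes the listed ranges as a lookup (the per-connection figure is my interpretation, not a study result).

# Bundling table from the list above, as a lookup: connections -> (low, high) bits.
bundle_bits = {
    1: (5, 7),
    2: (5, 18),
    3: (7, 26),
    4: (16, 38),
    5: (17, 48),   # e.g. trained eyes bundling 5 for colour purity / 48Bit HDR vision
}

for n, (low, high) in bundle_bits.items():
    print(f"{n} connection(s): {low}Bit to {high}Bit (~{high / n:.1f} bits each at best)")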
RS
https://is.gd/DictionarySortJS
Quantization modelling : RS : Physics III Slit Experiment
"(SmoothQuant).The optimized model achieves >3X latency improvement with a custom dequantization kernel for FP16 inference. Although the work does not map to Int8 engine"
In view that inferencing is being activated in Int4, Int8 & Int16 & Floats F16b, F8 & F4,
Now my view is a vision of a slit experiment in Physics; a slit experiment shows light photons in slices through a screen..
Int4 IIII < Int8 IIIIIIII < Int16 IIIIIIIIIIIIIIII
Ratio 1:2:4 on contained knowledge
Minimal Origin of mankind's knowledge : IIII < IIIIIIII < IIIIIIIIIIIIIIII Defined Summit of all power
My method is to compress the point node data with
https://is.gd/WaveletAutoEncoder
https://github.com/GPUOpen-LibrariesAndSDKs/brotli_g_sdk
So what we do is take advantage of patterns; creating tables of 1111 1010 as examples; these compress well & can be short-noted as patterns,
We can expand 4Bit into 8Bit inference & compress as patterns; The total data point is 4Bit if it is a pattern,
The subject is not predictable unless we pick the patterns!
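A minimal sketch of that pattern idea as I read it: store 4Bit codes & expand them to 8Bit inference values through a small pattern table (codebook). The table contents here are invented for illustration; a real table would be built from the patterns picked out of the model's data.

import numpy as np

# 16 possible 4Bit codes -> 16 representative 8Bit values (the "patterns").
pattern_table = np.linspace(0, 255, 16).astype(np.uint8)

codes = np.array([0b0000, 0b1010, 0b1111, 0b0101], dtype=np.uint8)  # stored 4Bit data
expanded = pattern_table[codes]     # 8Bit inference values, 4Bit storage cost
print(expanded)

The total data point stays 4Bit because only the code & the shared table are stored; the 8Bit expansion happens at inference time.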
We can however quantize the memory footprint; the Double/Single precision operations may be faster!
We need the models to work in F16 & Int8 & Int4 after all, but I see a reason to use Floats, because sub-quantization does leave a remainder for us to compare..
That relevant 'F16' >=-
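A hedged sketch of that remainder idea, assuming plain symmetric Int8 quantization of FP16 weights (the scale choice & values are illustrative): the float remainder is what Int8 alone would lose, kept for comparison or later correction.

import numpy as np

# Quantize FP16 weights to Int8 & keep the float remainder.
weights = np.array([0.1234, -0.5678, 0.9012], dtype=np.float16)
scale = np.float16(np.abs(weights).max() / 127)        # symmetric scale

q_int8 = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
remainder = weights - (q_int8.astype(np.float16) * scale)   # what Int8 alone loses

print(q_int8, remainder)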
RS
Study Subject Reduction :
https://science.n-helix.com/2021/03/brain-bit-precision-int32-fp32-int16.html
https://science.n-helix.com/2022/10/ml.html
https://blog.openvino.ai/blog-posts/q123-technology-update-low-precision-and-model-optimization
https://blog.openvino.ai/blog-posts/q223-technology-update-low-precision-and-model-optimization
https://blog.openvino.ai/blog-posts/q323-technology-update-low-precision-and-model-optimization
https://blog.openvino.ai/blog-posts/q423-technology-update-low-precision-and-model-optimization
Batch Size 240W>65W, 32GB{64, 16}, 15W>5W, 4GB{16, 1} : 16, 8, 4 seems optimal,
Time taken compatible:
ML With USB: Stress-Testing USB Accelerators for Efficient Edge Inference
https://www.researchgate.net/publication/377174200_Stress-Testing_USB_Accelerators_for_Efficient_Edge_Inference
Python & JS Configurations
https://is.gd/DictionarySortJS
*
Restricted Boltzmann ML Networks : Brain Efficient
For example this works well with fonts & web browsers & consoles, or standard input display hubs, or User Interfaces (UI) & JS & webpage code.
In the old days photo applications existed that used ML image enhancement on older processors..
So how did they exploit Machine Learning on hardware with only MMX, for example?
Procedural process data analytics:
Converting large statistics databases, for general Tessellation/Interpolation of images
The procedural element is writing the code that interpolates data based upon the statistics database...
Associated colours..
Face identity...
Linearity or curvature...
Association of grain & texture...
Databases get large fast & a 2MB to 15MB database makes the most sense...
Averages have to be categorized as either being worthy of 2 places in the database or a single average..
You can still run ML on a database object & then the points in the table are called nodes!
Indeed you can do both; however database conversion makes datasets way more manageable to run within the SIMD & AVX feature-set.
The matter of inferencing then has to be reduced to statistical averages, & sometimes ML runs fine inferencing this way.
Both ways work, Whatever is best for you & the specific hardware.
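An illustrative sketch of the statistics-database approach above: a tiny node table & interpolation of a new sample from its nearest nodes, the kind of procedural step that maps cleanly onto SIMD/AVX. The node values & the linear-interpolation choice are my own example, not taken from the text.

import numpy as np

# Database nodes: (feature value, associated output statistic).
nodes = np.array([[0.0, 10.0],
                  [0.5, 22.0],
                  [1.0, 40.0]])

def interpolate(x: float) -> float:
    """Linear interpolation between the two nearest database nodes."""
    return float(np.interp(x, nodes[:, 0], nodes[:, 1]))

print(interpolate(0.75))   # value inferred from the node table, no full ML pass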
(c)Rupert S
DL-ML slide : Machine Learning DL-ML
https://is.gd/LEDSource
Python & JS Configurations
https://is.gd/DictionarySortJS
https://iopscience.iop.org/article/10.1088/1741-4326/ad142f
https://is.gd/TokmaML