performance - Approaches to speeding up scientific computations on a server accessible by internet users
I am interested in any conventional wisdom on how to approach the following problem. Note that I am a hardware person, so please go easy on industry knowledge and jargon.
I am providing an online application that involves very complex math computations, such as fast Fourier transforms, with nested for-loops and very large data arrays (1.6 GB each). Users on the internet will sign in to this application, enter some custom parameters, and submit a job that calls these math computations. To keep user wait times to a minimum, and to allow multiple independent sessions for multiple users (each user with a separate thread), I am wondering how I can speed up the math computations, which I expect to be the bottleneck.
I am not so much looking for advice on how to structure the program (such as using integer data types instead of floats, smaller arrays, and so on, wherever possible), but I am interested in what can be done to speed things up once the program is complete.
For example, how do I ensure that multiple cores in the CPU are automatically used based on demand? (Is that done by default, or do I need to manage the process somehow?)
Or, how do I do parallel processing (splitting a for-loop among multiple cores and/or machines)?
Any practical advice is highly appreciated. I am sure that I am not the first to need this, so I am hoping that industry best practices that scale with demand are available.
Thanks in advance!
FFT methods are highly parallelizable, especially in multiple dimensions.
There are classical implementations.
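To illustrate why multi-dimensional FFTs parallelize so well, here is a toy sketch in pure Python. It is not how a production library works internally; the names `fft`, `fft2`, and `map_fn` are made up for this illustration, and the input length is assumed to be a power of two. The point is only that a 2-D transform is a set of independent 1-D transforms over the rows, followed by another independent set over the columns.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2(grid, map_fn=map):
    """2-D FFT: transform every row, then every column.

    map_fn is any map-like callable (hypothetical parameter). No row depends
    on another row, and no column on another column, so a pool's map method
    can be passed in to spread the 1-D transforms over several workers.
    """
    rows = list(map_fn(fft, grid))                      # independent row FFTs
    columns = [list(c) for c in zip(*rows)]             # transpose
    transformed = list(map_fn(fft, columns))            # independent column FFTs
    return [list(r) for r in zip(*transformed)]         # transpose back
```

Passing the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn` would hand each row or column to a separate worker process. For real workloads an established FFT library is the better choice; the sketch only shows where the parallelism comes from.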
One approach (depending on the available hardware) is a thread pool (or a process pool, depending on the configuration).
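As a rough sketch of the pool idea, the following splits one long loop into contiguous chunks and hands each chunk to a pool worker. The function names (`partial_sum`, `split_range`, `sum_of_squares`) and the sum-of-squares workload are placeholders invented for this example, not anything from the application in the question. It defaults to a process pool because, in standard CPython builds, threads share one interpreter lock and pure-Python arithmetic does not spread across cores with threads alone.

```python
import os
import concurrent.futures as cf


def partial_sum(bounds):
    """Work for one chunk: sum of i*i for i in [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for p in range(parts):
        stop = start + step + (1 if p < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_of_squares(n, executor=None, workers=None):
    """Compute sum(i*i for i in range(n)) chunk by chunk on a pool.

    If no executor is given, a process pool with one worker per chunk
    is created for the duration of the call.
    """
    workers = workers or os.cpu_count() or 1
    chunks = split_range(n, workers)
    if executor is not None:
        return sum(executor.map(partial_sum, chunks))
    with cf.ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))
```

When the default process pool is used from a script, the call should sit under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module. A long-running server would normally create one pool at startup and pass it in as `executor` rather than building a new pool per request.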
At my work, we have had a lot of success with a PC farm and a simple queue: each data packet is taken from the queue by a PC, computed (on multiple cores), and sent back to the user.
Do not try to micro-optimize the math yourself; instead, use one of the libraries mentioned above. Focus on designing the packets, implementing the computation queue (do not forget some kind of quota or priorities), and ensuring that the computed data is reliably sent back to the thread that requested the packet.
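The queue-with-priorities design can be sketched on a single machine as follows. This is only an illustration under stated assumptions: worker threads stand in for the PCs of a farm, the class name `JobQueue` and its methods are hypothetical, and a real farm would need a networked queue plus the quota handling that is left out here. Each submitted job returns a `Future`, which is how the result (or the error) finds its way back to the thread that submitted it.

```python
import itertools
import queue
import threading
from concurrent.futures import Future


class JobQueue:
    """Priority queue of compute jobs served by a fixed set of worker threads."""

    def __init__(self, workers=2):
        self._jobs = queue.PriorityQueue()
        self._order = itertools.count()  # tie-breaker: FIFO within one priority
        self._threads = [
            threading.Thread(target=self._serve, daemon=True)
            for _ in range(workers)
        ]
        for t in self._threads:
            t.start()

    def submit(self, priority, func, *args):
        """Queue func(*args); lower priority numbers are served first."""
        result = Future()
        self._jobs.put((priority, next(self._order), func, args, result))
        return result

    def _serve(self):
        while True:
            _, _, func, args, result = self._jobs.get()
            if func is None:  # shutdown marker
                return
            if not result.set_running_or_notify_cancel():
                continue  # the caller cancelled the job while it was queued
            try:
                result.set_result(func(*args))
            except Exception as exc:  # hand the failure to the waiting caller
                result.set_exception(exc)

    def close(self):
        """Let already queued jobs finish, then stop the workers."""
        for _ in self._threads:
            self._jobs.put((float("inf"), next(self._order), None, (), None))
        for t in self._threads:
            t.join()
```

A request-handling thread would call something like `jobs.submit(1, compute, packet).result(timeout=...)` and block until its own packet comes back, while other users' packets move through the same queue according to their priority.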
Depending on the hardware (a huge SMP computer or a PC farm), the problems are different.
(If you have the choice, go for a PC farm.)
Edit: You can use OpenMP to parallelize loops automatically.
As for PC farms, they have an advantage over big machines in terms of flexibility: they scale well, they are not very expensive, and they can be bought, sold, and reused efficiently. Linux is probably a good option, but it depends on which environment you are comfortable with.
Unfortunately, I have to say that there is no good library for distributing computation requests reliably and efficiently across a PC farm. The problem is quite difficult because you have to account for breakdowns, network communication, load, distribution of processes...