Every sensor in the array is sampling at some frequency, so, to first order, you can take that sampling frequency and the sample size and get an idea of the input bandwidth in bytes/second. There are of course bandwidth reduction steps (filtering, downsampling, beamforming)...
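A minimal back-of-envelope sketch of that first-order estimate, in Python; every number here (sensor count, sampling rate, sample size) is a made-up placeholder, not a figure for any real array:

    # First-order input data rate for a sensor array.
    # All numbers below are illustrative placeholders.
    num_sensors = 512          # receivers in the array
    sample_rate_hz = 500e6     # ADC sampling frequency per sensor
    bytes_per_sample = 2       # e.g. one 16-bit real sample

    rate_bytes_s = num_sensors * sample_rate_hz * bytes_per_sample
    print(f"raw input bandwidth: {rate_bytes_s / 1e9:.0f} GB/s")  # -> 512 GB/s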
Sorry, not sure I follow how you got from what I said (explaining how much data sensors produce) to 'increasing the sampling frequency'? You usually sample at a larger bandwidth, apply a specifically tailored band-pass filter to remove aliasing effects, and then downsample. This is a classic signal acquisition pattern: https://dsp.stackexchange.com/questions/63359/obtain-i-q-com...
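A rough sketch of that pattern, assuming NumPy/SciPy; note that scipy.signal.decimate applies a low-pass anti-aliasing filter before downsampling, so for a band centered away from DC you'd mix down to baseband first, and all parameters here are invented for illustration:

    import numpy as np
    from scipy import signal

    fs = 1_000_000                 # oversampled acquisition rate (Hz)
    t = np.arange(0, 0.1, 1 / fs)
    # Signal of interest at 10 kHz plus an out-of-band tone at 300 kHz
    x = np.sin(2 * np.pi * 10e3 * t) + 0.5 * np.sin(2 * np.pi * 300e3 * t)

    # Filter first so the rate reduction cannot alias, then keep 1 in 10.
    decim = 10
    x_dec = signal.decimate(x, decim)  # anti-alias filter + downsample
    fs_new = fs / decim                # 100 kHz effective rate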
None of this changes the actual amount of real data you have at the end of the day, though, after all is said and done; that's what I mean, so long as you don't botch it and capture too little. In computing terms, the amount of real data in a compressed archive and in the uncompressed original is the same, even though the file size is larger for the latter.
On SKA, from what I understand, they sample broadband but quickly beamform and downsample, as the data rates would be unsustainable to store across the whole array.
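For a feel of how beamforming cuts the rate, a toy delay-and-sum beamformer on a complex I/Q block; the geometry, frequency and steering angle are invented, nothing SKA-specific:

    import numpy as np

    c, f0 = 3e8, 100e6          # propagation speed (m/s), carrier (Hz)
    n_ant, spacing = 8, 1.5     # uniform linear array: count, spacing (m)
    theta = np.deg2rad(30)      # steering direction

    # Phase weights that align a plane wave arriving from theta
    k = 2 * np.pi * f0 / c
    weights = np.exp(-1j * k * spacing * np.arange(n_ant) * np.sin(theta))

    # samples: (n_ant, n_time) complex I/Q block from the digitizers
    samples = np.random.randn(n_ant, 4096) + 1j * np.random.randn(n_ant, 4096)
    beam = weights @ samples / n_ant  # n_ant streams collapse into one beam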
Right, that makes sense; you'd be looking at an insane amount of data across the ranges these sensors can observe. But they would still need to preserve phase information if they want to use the array for what it is best at, and that alone is a massive amount of data.
I think they preserve timestamped I/Q data. I know some people looking at down-sampling and preselecting those signals for longer-term storage and deeper reprocessing, and they seem to have a 24h window to 'analyze and keep what you need'.
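To see why a time-limited buffer forces that triage, a rough sizing sketch; bandwidth, bit depth and beam count are guesses for illustration, not actual specs:

    # How fast a 24 h buffer of timestamped I/Q fills up (guessed numbers).
    bw_hz = 300e6              # complex sample rate ~= usable bandwidth
    bits_per_component = 8     # per I and per Q
    n_beams = 16

    bytes_per_s = bw_hz * 2 * (bits_per_component / 8) * n_beams
    per_day_tb = bytes_per_s * 86_400 / 1e12
    print(f"{bytes_per_s / 1e9:.1f} GB/s -> {per_day_tb:.0f} TB per 24 h")
    # -> 9.6 GB/s, ~829 TB per day: hence 'analyze and keep what you need'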
We're still in a technological phase where ADCs are far more advanced than storage and online processing systems, which means throwing away a lot. But I have high hopes for a model where you upgrade computing, network, storage (and maybe ADCs...) and get an improved sensor. Throw man-hours at some GPU kernel developers and you get new science. The limit now seems to be less about the technology than about having enough people and compute to fully exploit the data...
Too late to edit: any idea of the resolution the I/Q data is sampled at (bandwidth, bit depth)? I visited one of these installations a while ago and the tour guide really had no clue about any of the details (I think he was the son of one of the scientists).
BTW, if you're interested in the concept of upgrading a sensor without retooling the RF part, and the impact of 'just' putting in new COTS racked server hardware and engineering man-hours to get a 'new' sensor with new capabilities, have a look at Julien Plante's work on NenuFAR (which isn't like the SKA at all :-)): https://cnrs.hal.science/USN/obspm-04273804v1 . Damien Gratadour, his PhD supervisor, is an amazing technologist dedicated to improving astronomy instruments, and I was very lucky to work with him and his team... the things the French can string together with small teams and thin budgets...