Friday 28 April 2017

Get download URL from HTML source to download file from Content-Disposition

I'm trying to download a file from a site with Python. The issue is that the download starts automatically after the form on the page is submitted. Using Mechanize, I am able to log in, get to the page where the download lives, fill out the form, and submit it (which kicks off the download of an xls file).

Looking at the Content-Disposition header of the response, I can see the attachment name:

attachment {'filename': 'policytransactions.xls'}

but I cannot figure out how to download this file locally.
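
For reference, here is a minimal sketch of what "download this file locally" would mean in code. It assumes the response to the form submission is the spreadsheet itself, so its body only needs to be written to disk under the name given in the header. The helper names `filename_from_content_disposition` and `save_response` and the fallback name `download.bin` are made up for illustration, and the code is written for Python 3 using only the standard library.

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, default="download.bin"):
    """Return the filename parameter of a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()  # None when the header has no filename parameter
    # basename() guards against a server-supplied path such as "../x.xls"
    return os.path.basename(name) if name else default


def save_response(response, header_value, directory="."):
    """Write the body of a file-like response to disk and return the path."""
    filename = filename_from_content_disposition(header_value)
    path = os.path.join(directory, filename)
    with open(path, "wb") as out:  # binary mode, since xls is not text
        out.write(response.read())
    return path
```

Any object with a `read()` method that returns bytes can be passed as `response`. Whether the object returned by the Mechanize submit call actually carries the xls bytes is the assumption to check first.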

Looking at the page source, I can see that the answer to my question is somewhere in here:

<td><div id="form1:j_idt37" class="ui-datatable ui-widget dataTable"><table role="grid"><thead><tr role="row"><th id="form1:j_idt37:j_idt38" class="ui-state-default" role="columnheader"><div class="ui-dt-c"><span></span></div></th></tr></thead><tfoot></tfoot><tbody id="form1:j_idt37_data" class="ui-datatable-data ui-widget-content"><tr data-ri="0" class="ui-widget-content ui-datatable-even" role="row"><td role="gridcell"><div class="ui-dt-c">
<script type="text/javascript" src="/policy/app/javax.faces.resource/jsf.js?ln=javax.faces"></script>
<a href="#" onclick="mojarra.jsfcljs(document.getElementById('form1'),{'form1:j_idt37:0:j_idt39':'form1:j_idt37:0:j_idt39','format':'xls'},'');return false" class="commandLink"><span class="outputText">XLS</span></a></div></td></tr></tbody></table></div><script id="form1:j_idt37_s" type="text/javascript">PrimeFaces.cw('DataTable','widget_form1_j_idt37',{id:'form1:j_idt37'});</script></td>
<td><table>
<tbody>
<tr>
<td><span id="form1:dateField3"><input id="form1:dateField3_input" name="form1:dateField3_input" type="text" value="04/01/2017" class="ui-inputfield ui-widget ui-state-default ui-corner-all" /></span><script id="form1:dateField3_s" type="text/javascript">$(function(){PrimeFaces.cw('Calendar','widget_form1_dateField3',{id:'form1:dateField3',popup:true,locale:'en_US',dateFormat:'mm/dd/yy',defaultDate:'04/01/2017'});});</script></td>
<td><span id="form1:dateField4"><input id="form1:dateField4_input" name="form1:dateField4_input" type="text" value="04/28/2017" class="ui-inputfield ui-widget ui-state-default ui-corner-all" /></span><script id="form1:dateField4_s" type="text/javascript">$(function(){PrimeFaces.cw('Calendar','widget_form1_dateField4',{id:'form1:dateField4',popup:true,locale:'en_US',dateFormat:'mm/dd/yy',defaultDate:'04/28/2017'});});</script></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<input type="hidden" name="javax.faces.ViewState" id="javax.faces.ViewState" value="e2s1" />
</form>
</div>

Any suggestions on how to grab this? Thanks
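
One possible reading of the source above: the XLS link has no real URL (`href="#"`), and its `onclick` handler appears to submit `form1` with two extra parameters. If that is right, the download could be triggered by POSTing the form's fields together with those parameters. The sketch below only builds that POST body; the function names are hypothetical, it is written for Python 3, and it assumes the server accepts a plain form POST, which I have not verified.

```python
import re
from urllib.parse import urlencode

# Matches 'key':'value' pairs inside the {...} argument of the onclick handler.
PAIR_RE = re.compile(r"'([^']*)'\s*:\s*'([^']*)'")


def jsfcljs_params(onclick):
    """Extract the extra parameters passed to mojarra.jsfcljs in an onclick string."""
    match = re.search(r"\{(.*?)\}", onclick)
    return dict(PAIR_RE.findall(match.group(1))) if match else {}


def build_post_body(form_fields, onclick):
    """Merge the form's own fields with the onclick parameters and URL-encode them."""
    fields = dict(form_fields)
    fields.update(jsfcljs_params(onclick))
    return urlencode(fields)


onclick = (
    "mojarra.jsfcljs(document.getElementById('form1'),"
    "{'form1:j_idt37:0:j_idt39':'form1:j_idt37:0:j_idt39','format':'xls'},'');"
    "return false"
)
# Field names and values copied from the page source above.
form_fields = {
    "form1:dateField3_input": "04/01/2017",
    "form1:dateField4_input": "04/28/2017",
    "javax.faces.ViewState": "e2s1",
}
body = build_post_body(form_fields, onclick)
```

The resulting `body` would then be sent to the form's action URL by whichever client holds the logged-in session. The form may contain further hidden fields that are not visible in the excerpt above, and those would need to be included as well.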



via Adam Makharita
